data-transformation

Here are 106 public repositories matching this topic...

mahmoud / glom

☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️

python cli data dictionaries utilities declarative data-transformation nested-structures recursion apis

Updated Jan 12, 2025
Python

hi-primus / optimus

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

data-science machine-learning spark bigdata data-transformation pyspark data-extraction data-analysis data-wrangling dask data-exploration data-preparation data-cleaning data-profiling data-cleansing big-data-cleaning data-cleaner cudf dask-cudf

Updated Dec 2, 2024
Python

jupyter-naas / naas

Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)

open-source data-science data binder ai integration jupyter pipeline etl engine data-transformation jupyterlab notebooks

Updated Feb 14, 2025
Python

mahmoudparsian / data-algorithms-with-spark

O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian

Updated Jun 26, 2023
Python

jim-schwoebel / allie

🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.

machine-learning deep-learning data-transformation data-visualization machine-learning-library machine-learning-api datasets data-cleaning ludwig data-augmentation automl tpot machine-learning-models model-compression model-deployment autokeras voice-computing data-cleaning-pipeline autopytorch

Updated Sep 21, 2023
Python

bloomberg / pycsvw

A tool to read CSV files with CSVW metadata and transform them into other formats.

csv rdf data-transformation csvw

Updated Apr 30, 2019
Python

kmatarese / glide

Easy ETL

python data etl data-transformation pipelines data-processing dataframes dag parallel-processing

Updated Aug 12, 2022
Python

dreftymac / dynamic.yaml

DEPRECATED: YAML-based data transformations

yaml data-transformation code-generation

Updated Oct 11, 2019
Python

VishanthSurresh / Spotify-Capstone-Project---Data-Engineering

This repository is a working ETL framework that uses user data from the Spotify API: Python for extraction and transformation, SQL for data loading and staging, Airflow for data orchestration and monitoring, and Power BI for reporting.

scheduling data-transformation data-visualization orchestration api-call etl-pipeline data-loading data-modelling

Updated Apr 16, 2023
Python

bagher / fast-resource

fast-resource is a data transformation layer that sits between the database and the application's users, enabling quick data retrieval. It further enhances performance by caching data using Redis and Memcached.

python redis flask memcached django cache data-transformation fastapi

Updated May 15, 2023
Python

cybersader / jsonaut

A GUI and library for exploring, analyzing, and flattening JSON files of any size (including multi-gigabyte files) into CSVs, with CSV transformations and dynamic CSV filtering, all with low memory utilization.

python json gui csv etl data-transformation pandas data-engineering awesome-list data-integration data-pipelines csv-export json-to-csv json-flattener huge-data-files pyqt5-gui

Updated Jun 7, 2023
Python

bennyaustin / pyspark-utils

Reusable Python classes that extend open source PySpark capabilities. Example implementations are available in the notebooks of the repo https://github.com/bennyaustin/synapse-dataplatform

apache-spark data-transformation pyspark azure-databricks azure-synapse-analytics synapse-spark azure-synapse-sparkpool

Updated Nov 1, 2024
Python

quantumudit / Insurance-Portfolio-Analysis

This project uses Python and Power BI to analyze and visualize the insurance portfolio of an anonymous company that implemented an aggressive growth plan in 2021 across the counties of Florida.

python etl jupyter-notebook data-transformation power-bi data-visualization data-analytics geospatial-analysis

Updated Dec 29, 2021
Python

CoDS-GCS / KGFarm

A Holistic Platform for Automating Data Preparation

data-transformation feature-selection feature-engineering data-cleaning datapreparation

Updated Apr 19, 2024
Python

lykmapipo / Python-Spark-Log-Analysis

Python scripts to process and analyze log files using PySpark.

Updated Jul 13, 2024
Python

pillowTree3 / YouTube-Data-Harvesting-and-Warehousing-using-SQL-MongoDB-and-Streamlit

This project is a powerful Streamlit application designed to provide users with seamless access and analysis of data from multiple YouTube channels. This intuitive tool leverages the Google API to retrieve a comprehensive range of information, including channel details, video statistics, and viewer engagement metrics.

mysql python mongodb data-transformation streamlit

Updated Jul 13, 2023
Python

quantumudit / Analyzing-WhiskyExchange-Whisky

This project scrapes data on Japanese whisky from The Whisky Exchange website, transforms the scraped data, and then analyzes and visualizes it using Jupyter Notebook and Power BI.

python data-science etl jupyter-notebook data-transformation power-bi data-visualization data-analysis webscraping

Updated Dec 7, 2021
Python

aloftdata / vptstools

Python library to transfer and convert vertical profile time series data

python data-transformation weather-radar aeroecology oscibio

Updated Dec 4, 2024
Python

alicjamazur / data-engineering-case

Redshift-based ETL workflow automated with AWS Step Functions.

aws sql aws-lambda data-transformation pyspark amazon-redshift aws-step-functions aws-glue etl-workflow

Updated Jan 7, 2021
Python

ezvezdov / Dataset-Wrapper

Parser for the nuScenes, Lyft, Waymo, and A2D2 datasets.

python data-transformation dataset lyft lidar self-driving-car unification nuscenes waymo a2d2

Updated Aug 16, 2022
Python
