# etl

Here are 57 public repositories matching this topic...

jupyter-naas / naas

Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)

open-source data-science data binder ai integration jupyter pipeline etl engine data-transformation jupyterlab notebooks

Updated Jun 19, 2024
Python

superlinked / superlinked

A compute framework for turning complex data into vectors. Build multimodal vectors with ease and define weights at query time so you don't need a custom reranking algorithm to optimise results. Go straight from notebook to production with the same SDK.

python nlp natural-language-processing information-retrieval deep-learning etl retrieval ml embeddings vectorization semantic-search data-pipeline mlops vector-search llm retrieval-augmented-generation

Updated Jun 28, 2024
Jupyter Notebook

cedoula / Movies-ETL

Perform the Extract, Transform and Load (ETL) process to create a data pipeline on movie datasets using Python, Pandas, Jupyter Notebook and PostgreSQL.

python postgres json csv sql etl postgresql jupyter-notebook pandas pgadmin4 etl-framework etl-pipeline

Updated Oct 12, 2022
Jupyter Notebook

elasticlabs / airflow-jupyter-docker-compose

Orchestration of data science and earth observation models in Apache Airflow: scale up with the Celery Executor and experiment with Jupyter Notebook using a composition of Docker containers.

data-science airflow etl jupyter-notebook apache-airflow airflow-dags

Updated Aug 23, 2022
Python

markwsutton / ETL-using-Python-SQL

ETL using Python in Jupyter Notebook: loading CSV, cleaning data, and saving to a SQL database.

python sql database etl csv-files

Updated Nov 17, 2020
Jupyter Notebook

souvik-databricks / dlt-with-debug

A lightweight helper utility that lets developers do interactive pipeline development with a single source code for both DLT runs and non-DLT interactive notebook runs.

big-data spark etl python3 databricks dlt etl-pipeline big-data-processing delta-live-tables

Updated Dec 7, 2022
Python

cvilla87 / PySpark-ETL-Telecom

Jupyter Notebook showing how to process telecom datasets using PySpark (SparkSQL and DataFrames) and plot the results using Matplotlib.

python unix json csv spark hadoop etl jupyter-notebook pyspark hdfs sparksql matplotlib dataframe

Updated Dec 3, 2018
Jupyter Notebook

ramkumarpj / project-three

SEC Finance Data Engineering: ETL process for SEC finance data of S&P 500 companies. Jupyter Notebooks run the ETL workflows. The final dataset is hosted in MongoDB Atlas (cloud). The API is written in Python with the PyMongo and Flask libraries. The dashboards with charts are hosted in MongoDB Atlas.

python flask mongodb etl pymongo jupyter-notebook pandas data-engineering beautifulsoup extract-transform-load mongodb-atlas mongodb-atlas-cloud

Updated Mar 5, 2024
Jupyter Notebook

halpeter / ETL-Project

Using data extracted from Kaggle on the top restaurants from 2020, this project uses Python scripting in Jupyter Notebook to transform and clean the data and, finally, load the cleaned data frames into a PostgreSQL database.

etl extract transformations load

Updated Mar 29, 2021
Jupyter Notebook

EimisPacheco / Several-Jupyter-Notebooks

Jupyter Notebooks for different purposes: social network web scraping, ETL, Selenium WebDriver for web testing, automation using Python, data wrangling, data transformation, data cleaning, stock market analysis, APIs, machine learning algorithms, etc.

etl machine-learning-algorithms data-transformation data-wrangling data-cleaning stock-market-analysis selenium-python social-network-webscrapping

Updated Aug 9, 2020
Jupyter Notebook

sagarrathi / Projects

Various data analytics projects based on statistics, in notebook form.

python data-science etl statistical-models

Updated Mar 3, 2020
HTML

BinariesGoalls / Udacity-Data-Engineering-Nanodegree

This is a repository to hold the files and notebooks produced throughout my Udacity Data Engineering Nanodegree program.

python aws postgres airflow spark cassandra etl data-engineering data-pipelines data-modeling data-warehouses data-lakes

Updated Dec 5, 2022
PLpgSQL

waqarg2001 / PakWheels-Data-Analysis

An ETL process and EDA were performed on a used-car dataset scraped from PakWheels.com. The analysis was done in Jupyter Notebook, and insights were shared.

data-science data etl numpy exploratory-data-analysis logging pandas data-visualization python3 data-analytics data-analysis data-preprocessing pakistan webscrapping automobile

Updated Nov 12, 2022
Jupyter Notebook

anrobertson / Crowdfunding-ETL

Executed an ETL process from an Excel file using Python, Jupyter Notebook, and SQL.

csv sql etl postgresql pandas data-analysis

Updated Jan 11, 2023
Jupyter Notebook

wednesday-solutions / aws-glue-jupyter-notebook-starter

A starter repository for your next AWS Glue project. This comes with complete IaC, a CD pipeline and a reusable common SDK. Set up Jupyter Notebook for AWS Glue locally.

aws jupyter etl glue data-engineering de aws-glue jupyter-notbook

Updated Sep 6, 2023
Jupyter Notebook

I2DSR / data-science-ipython-notebooks

Data science encompasses a wide range of areas, topics, and sub-domains such as Big Data, Machine & Deep learning (ETL, TensorFlow, Keras), Data Mining/Visualization (EDA), BI, Predictive Analytics, Statistical Analytics, etc.

python data-science machine-learning data-mining r big-data deep-learning etl tensorflow exploratory-data-analysis keras data-visualization statistical-analysis business-intelligence predictive-analytics big-data-analytics

Updated May 3, 2024

mar1boroman / databricks-patterns

Common ETL patterns and utilities for PySpark. Notebooks tested on Databricks Community Edition.

data-science spark etl pyspark data-engineering databricks etl-framework cloud-migration databricks-notebooks databricks-email databricks-etl

Updated Sep 3, 2022
Jupyter Notebook

zhenghao0379 / py_etl

Python ETL demo in Jupyter Notebook

mysql python etl python3 azkaban

Updated May 22, 2020
Python

epomatti / az-databricks-etl

Sample notebooks on Azure Databricks for ETL

apache-spark etl azure terraform databricks synapse azure-databricks azure-synapse-analytics

Updated May 20, 2023
Scala

sarahrosegallagher / AWS_RDS_ETL

Jupyter Notebook ETL from AWS S3 bucket

etl aws-s3 jupyter-notebook data-analysis

Updated Jul 3, 2022
Jupyter Notebook
