Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)
-
Updated
Jun 19, 2024 - Python
Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)
A compute framework for turning complex data into vectors. Build multimodal vectors with ease and define weights at query time so you don't need a custom reranking algorithm to optimise results. Go straight from notebook to production with the same SDK.
Perform the Extract, Transform and Load (ETL) process to create a data pipeline on movie datasets using Python, Pandas, Jupyter Notebook and PostgreSQL.
Orchestration of data science and earth observation models in Apache Airflow, scale-up with Celery Executor, experiment with jupyter notebook using a docker containers composition
A lightweight helper utility which allows developers to do interactive pipeline development by having a unified source code for both DLT run and Non-DLT interactive notebook run.
SEC Finance Data Engineering - ETL process for SEC Finance data of S&P 500 companies. Jupyter Notebooks to run ETL work flows. The final dataset is hosted in MongoDB Atlas(cloud). The API is written using Python with PyMongo and Flask libraries. The dashboards with charts are hosted in MongoDB Atlas.
Using data extracted from Kaggle on the top restaurants from 2020, this project utilized Python scripting in Jupyter Notebook to transform and clean the data and finally, load the cleaned data frames into a PostgreSQL database.
Jupyter Notebooks with different purposes: Social Network WebScrapping, ETL, Selenium WebDriver for Web Testing, Automation using Python, Data Wrangling, Data Transformation, Data Cleaning, Stock Market Analysis, APIs, Machine learning Algorithms, etc...
Various Data Analytics Projects based On Statistics in form of Notebook.
This is a repository to hold the files and notebooks produced throughout my Udacity's Nanodegree Data Engineering program.
ETL process and EDA was performed on used cars dataset scraped from PakWheels.com. The analysis was done through Jupyter Notebook. Insights were shared.
Executed an ETL process from an excel file using Python, Jupyter Notebook, and SQL.
A starter repository for your next AWS Glue project. This comes with complete IaC, a CD pipeline and a reusable common SDK. Set up jupyter notebook for AWS Glue locally
Data science encompasses a wide range of areas, topics, and sub-domains such as Big Data, Machine & Deep learning (ETL, TensorFlow, Keras), Data Mining/Visualization (EDA), BI, Predictive Analytics, Statistical Analytics, etc.
Common ETL patterns and utilities for PySpark. Notebooks tested on Databricks Community edition
Sample notebooks on Azure Databricks for ETL
Jupyter Notebook ETL from AWS S3 bucket
Add a description, image, and links to the etl topic page so that developers can more easily learn about it.
To associate your repository with the etl topic, visit your repo's landing page and select "manage topics."