Open Data Profiling, Quality and Analysis on NYC OpenData dataset with semantic profiling using fuzzy ratio, Levenshtein distance and regex
Updated Nov 10, 2020 - Jupyter Notebook
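As a rough illustration of the techniques this description names, here is a minimal sketch of Levenshtein distance, a length-normalized similarity score, and a regex check. It is not taken from the repository: the function names, the normalization, and the ZIP-code pattern are all assumptions chosen for the example, and the project may well rely on a fuzzy-matching library instead.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution
                )
            )
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalized score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


# Hypothetical pattern for a US ZIP code column (5 digits, optional +4 suffix).
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def looks_like_zip(value: str) -> bool:
    """Return True when the whole (stripped) value matches the ZIP pattern."""
    return ZIP_PATTERN.fullmatch(value.strip()) is not None
```

A profiler along these lines might label a column by checking each value against a set of patterns first and falling back to the similarity score against a list of known values, such as borough names, when no pattern matches.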
A Kedro plugin that provides drop-in replacements for the pandas datasets (e.g., Modin and cuDF)
A simple example of how Modin can speed up your pandas workflows by changing a single line of code
Recommendation system approaches
Using the MovieLens dataset with Surprise to compare different algorithms for rating prediction, and also create a movie recommendation system on top of it.
A transformation pipeline for Delta Lake using AWS SDK for Pandas
Delve deeper into data manipulation using Python's prominent libraries. Explore the functionalities of Pandas and get a glimpse of alternatives like Polars, Dask, and Modin.
HHA507 / Data Science / Assignment 2 / Data Manipulation
oneAPI Hackathon: The LLM Challenge
AI Starter Kit to generate structured synthetic data using Intel® Distribution of Modin
A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner
A Bioinformatics demo in Python working with FASTQ files and using the Modin library
Global Markets Options Pricing
A low-level execution library for analytic data processing.
Distributed XGBoost on Ray