PySpark (Python for Apache Spark) Essentials - Wrangling Big Data for Machine Learning.
Updated Jun 27, 2023 - Jupyter Notebook
Data parsing with pandas in a Jupyter notebook.
Notebooks with some EDA and Machine Learning Models
Repo containing Jupyter notebooks with practice projects from DataCamp
Data wrangling of OpenStreetMap Milwaukee data with sqlite3 in an IPython/Jupyter Notebook
Sample Azure Databricks notebook for connecting to the Blob Storage of Azure Log Analytics
Data wrangling of Austin OpenStreetMap data using Python, PyMongo, MongoDB, and Jupyter Notebooks.
This Jupyter notebook shows how the average level of U.S. Senate representation varies with race and ethnicity.
Data analysis of a bikeshare system, using Python in Jupyter Notebook for EDA and visualization and Streamlit for an interactive dashboard.
A classification approach to the Titanic survival machine learning challenge on Kaggle. Data visualisation, data preprocessing, and different algorithms are tested and explained in the form of Jupyter Notebooks.
An analysis of European soccer/football from 2008 to 2016 using a dataset from Kaggle. I employ different data wrangling techniques to clean and filter the data using Python and Jupyter notebook.
This repository contains my data science projects/assignments completed for academic and self-learning. The projects/assignments are mostly in Jupyter notebook files. Please do contact me via LinkedIn if you are hiring a data scientist. 🤓
Jupyter Notebook that scrapes NBA player salary data from espn.com, injects that data into an existing dataset (one owned by TrueHoop CEO and CoFounder Henry Abbott), and exports the resulting augmented dataset to a .csv file.
Discover 'Data Science Complete Course' on GitHub - Python notebooks, datasets, and structured daily lessons (day1, day2, etc.). Dive into hands-on examples, exercises, and lectures for a comprehensive learning experience. Elevate your data science skills with this curated and organized repository. #DataScience #Python #GitHub
Build a machine learning model to predict whether a house will sell, based on a set of features. The results are presented as interactive widgets in a Jupyter notebook for a technical audience, who can use them to make informed decisions about selling their properties.
In this Jupyter notebook, we're going to find out how some of the elements of a particular show interact with each other. After exploring and cleaning our data a little, we're going to answer some basic questions like: What are the most extreme game outcomes? How does the game affect television viewership?
Jupyter Notebooks with different purposes: social network web scraping, ETL, Selenium WebDriver for web testing, automation using Python, data wrangling, data transformation, data cleaning, stock market analysis, APIs, machine learning algorithms, etc.
Contains labs in Jupyter Notebooks created as part of my IBM Data Science Professional Certificate
A Jupyter Notebook with the analysis of a WhatsApp chat using several techniques of data wrangling, EDA, and sentiment analysis
This repo features Jupyter Notebook labs for learning data analysis with Python. Explore data acquisition, wrangling, visualization, modeling, and evaluation. Enhance your skills in Python data analysis.