A Personal Work Notebook on Gitbook. 1) Summary of programming knowledge points; 2) Example user-data solutions for big-data scenarios
Updated May 19, 2020 - HTML
Homework Notebooks of Art of Analyzing Big Data - The Data Scientist’s Toolbox.
Big data analysis of a Bundesliga football league dataset using PySpark, Spark SQL, and NumPy, carried out in a Jupyter Notebook.
This repository is a comprehensive collection of notebooks that covers various data science projects in detail. Each project is designed to provide a clear understanding of the data science pipeline, from data acquisition to model deployment.
In this Jupyter notebook, fictional data on football players is used to perform big data analytics in Python, using libraries such as pandas and matplotlib.
This is my final project for PSTAT 135, Big Data Analytics, using PySpark to conduct county-wide voter turnout regression analysis by demographic. This project was done in collaboration with Tyler Kim and Erasmo Rivas. The GCP storage bucket linked below contains the full project, while the Jupyter notebook and exported PDF are included here.
Data science encompasses a wide range of areas, topics, and sub-domains such as Big Data, Machine & Deep learning (ETL, TensorFlow, Keras), Data Mining/Visualization (EDA), BI, Predictive Analytics, Statistical Analytics, etc.