Make sure you know how to do these tasks before joining the event:
- Import libraries like pandas, numpy, scikit-learn, matplotlib, seaborn
- Read CSV or Excel files
- Data preprocessing tasks (data cleaning, data transformation, and data reduction), such as removing columns, checking for missing values, and finding outliers and correlations between variables
- Data visualization tasks like making histograms, boxplots, charts, and maps (for two or more variables)
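
As a quick self-check on the preprocessing tasks above, here is a minimal sketch using pandas. The column names and values are made-up toy data, not taken from any of the event datasets, and the 1.5 * IQR rule is just one common convention for flagging outliers.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for a real file. With a real file you
# would call pd.read_csv("your_file.csv") or pd.read_excel("your_file.xlsx").
csv_text = """hotel_id,price,rating,notes
1,100,4.0,a
2,120,4.2,b
3,,3.9,c
4,110,4.1,d
5,950,4.8,e
6,105,,f
"""
df = pd.read_csv(io.StringIO(csv_text))

# Remove a column that is not needed for the analysis.
df = df.drop(columns=["notes"])

# Count missing values in each column.
missing_counts = df.isna().sum()

# Flag outliers in "price" with the 1.5 * IQR rule (an assumed convention).
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["price"] < q1 - 1.5 * iqr) | (df["price"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

# Pairwise Pearson correlation between two numeric columns
# (rows with missing values are skipped pair by pair).
corr = df[["price", "rating"]].corr()

print(missing_counts)
print(outliers)
print(corr)
```

If each step here makes sense to you, you likely have the loading and preprocessing basics covered.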
- Expedia: https://www.kaggle.com/competitions/expedia-hotel-recommendations/data?select=train.csv
- Job Market: https://www.kaggle.com/datasets/sl6149/data-scientist-job-market-in-the-us
- International football results from 1872 to 2022: https://www.kaggle.com/datasets/martj42/international-football-results-from-1872-to-2017
- Movies: https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset
- National Survey of Drug Use and Health (2015-2019): https://www.kaggle.com/datasets/bgallamoza/national-survey-of-drug-use-and-health-20152019
- Identify the goals or objectives from the problem or question you are trying to answer.
- Give a quick introduction and present what is in the data.
- Analyze and visualize the data using tables, plots, and maps.
- State your conclusions and findings to answer the question (for example, any correlations you found).
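
The steps above could be sketched roughly as follows with pandas and matplotlib. The data frame, its column names, and the question being asked are hypothetical placeholders; a real submission would use one of the event datasets and a question of your own.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical question: do teams that score more also concede less?
# Hypothetical toy data; replace with one of the event datasets.
df = pd.DataFrame({
    "team": ["A", "A", "B", "B", "C", "C"],
    "goals_for": [2, 3, 1, 0, 4, 2],
    "goals_against": [1, 1, 2, 3, 0, 2],
})

# Introduce the data: its size and basic summary statistics.
n_rows, n_cols = df.shape
summary = df.describe()

# Analysis as a table: average goals per team.
table = df.groupby("team")[["goals_for", "goals_against"]].mean()

# Visualization: a histogram of one variable and boxplots of two.
fig, axes = plt.subplots(1, 2, figsize=(8, 3))
axes[0].hist(df["goals_for"], bins=5)
axes[0].set_title("Goals scored")
axes[1].boxplot(df[["goals_for", "goals_against"]].to_numpy())
axes[1].set_xticklabels(["goals_for", "goals_against"])
axes[1].set_title("Scored vs. conceded")
fig.tight_layout()

# Finding: Pearson correlation between the two variables.
r = df["goals_for"].corr(df["goals_against"])

print(summary)
print(table)
print(f"Correlation between goals scored and conceded: {r:.2f}")
```

In an interactive session you would drop the `matplotlib.use("Agg")` line and call `plt.show()` to view the figure.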
- Pandas: https://pandas.pydata.org/docs/user_guide/index.html
- Python for Data Analysis book: https://wesmckinney.com/book/
- Data Analysis tutorials: https://www.geeksforgeeks.org/data-analysis-with-python/
- Python
- Seaborn: https://seaborn.pydata.org/
- matplotlib: https://matplotlib.org/stable/tutorials/index.html
- Data Visualization in Python tutorial: https://www.simplilearn.com/tutorials/python-tutorial/data-visualization-in-python#:~:text=Python%20offers%20several%20plotting%20libraries,most%20simple%20and%20effective%20way
- Python charts tutorial: https://gilberttanner.com/blog/introduction-to-data-visualization-inpython/
- Tableau
- R
- R for data science book: https://r4ds.had.co.nz/data-visualisation.html
- ggplot book: https://ggplot2-book.org/index.html