HaleyKwok/Python_Libraries_for_Data_Analytics

Python Libraries for Data Analytics

📍 Mission

The mission of this repository is to provide beginners with an easy-to-follow guide on the most popular Python libraries used in data analytics. We aim to provide clear and concise explanations, as well as practical examples, of how to use these libraries to manipulate, analyze, and visualize data. Our goal is to empower individuals with the knowledge and skills necessary to succeed in the field of data analytics.


🔆 Introduction

Data analytics is a rapidly growing field, and Python has become a popular choice for data analysts and scientists due to its flexibility and ease of use. Python has a vast array of libraries specifically designed for data analytics, which can be used to manipulate, analyze, and visualize data.


🥳 Features

Matplotlib is a powerful plotting library that can be used to create a variety of charts and graphs. It provides a wide range of customization options to create professional-looking visualizations. Some of the popular types of charts that can be created using Matplotlib include scatter plots, line charts, histograms, bar charts, and pie charts.
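As a rough illustration of the chart types mentioned above, the following sketch draws a line chart and a bar chart side by side. The data values, labels, and colors are made up for the example, and the `Agg` backend is chosen only so the script runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption so the sketch runs headless
import matplotlib.pyplot as plt

# Made-up sample data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 5, 3]

# One figure with two side-by-side axes
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

# Line chart with a few common customizations
ax_line.plot(x, y, marker="o", color="tab:blue", label="series A")
ax_line.set_title("Line chart")
ax_line.set_xlabel("x")
ax_line.set_ylabel("y")
ax_line.legend()

# Bar chart over three hypothetical categories
bars = ax_bar.bar(["a", "b", "c"], [3, 7, 5], color="tab:orange")
ax_bar.set_title("Bar chart")

fig.tight_layout()

# Render to an in-memory PNG; passing a filename such as "charts.png" also works
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In a notebook or desktop session, the backend line can be dropped and `plt.show()` used in place of saving.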

NumPy is a fundamental library for scientific computing in Python, which provides powerful tools for working with arrays and matrices. NumPy arrays are much faster than regular Python lists for numerical operations, making them ideal for data analysis. NumPy arrays are used extensively in other data analysis libraries such as Pandas and Matplotlib.
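A minimal sketch of array-based computation, using made-up price and quantity values, might look like this. The point is that arithmetic applies to whole arrays at once, with no explicit Python loop.

```python
import numpy as np

# Made-up values for illustration only
prices = np.array([10.0, 12.5, 9.0, 11.0])
quantities = np.array([3, 1, 4, 2])

# Elementwise multiplication over the whole array
revenue = prices * quantities

# Aggregations are methods on the array
total_revenue = revenue.sum()
mean_price = prices.mean()

# A 2 x 3 matrix built from a range, then summed down each column
matrix = np.arange(6).reshape(2, 3)
column_sums = matrix.sum(axis=0)

# The same elementwise product written with plain Python lists, for comparison
revenue_list = [p * q for p, q in zip(prices.tolist(), quantities.tolist())]
```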

Pandas is a library that provides high-performance, easy-to-use data structures and data analysis tools. It allows data to be manipulated and analyzed in a variety of ways, including filtering, merging, grouping, and reshaping. Pandas is designed to work with both small and large datasets and has a range of functions for reading data from a variety of sources such as CSV, Excel, SQL databases, and JSON.
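The operations listed above can be sketched on a tiny made-up dataset. The column names (`city`, `year`, `sales`, `region`) and values are hypothetical, and the CSV is read from an in-memory string so the example needs no external file.

```python
import io

import pandas as pd

# Hypothetical CSV content; pd.read_csv also accepts a file path
csv_text = """city,year,sales
Austin,2022,100
Austin,2023,150
Boston,2022,80
Boston,2023,120
"""
sales = pd.read_csv(io.StringIO(csv_text))

# A second, hypothetical lookup table to merge against
regions = pd.DataFrame(
    {"city": ["Austin", "Boston"], "region": ["South", "Northeast"]}
)

# Filtering: keep only the 2023 rows
recent = sales[sales["year"] == 2023]

# Merging: attach the region to each sales row
merged = sales.merge(regions, on="city", how="left")

# Grouping: total sales per region
totals = merged.groupby("region")["sales"].sum()

# Reshaping: one row per city, one column per year
wide = sales.pivot(index="city", columns="year", values="sales")
```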

The repository also includes a collection of online learning materials.


📋 Notes

Markdown format
PDF format


🧾 Project

Bengaluru House Price Project

This data science project series describes, step by step, how to build a real estate price prediction website. First, we build a model using sklearn and linear regression on the Bengaluru (Bangalore) house price dataset from kaggle.com. Second, we write a Python Flask server that uses the saved model to serve HTTP requests. Third, we build a website in HTML, CSS, and JavaScript that lets the user enter the house size, number of bedrooms, and so on, and calls the Python Flask server to retrieve the predicted price. During model building, we cover many core data science concepts, including data loading and cleaning, outlier detection and removal, feature engineering, dimensionality reduction, GridSearchCV for hyperparameter tuning, and k-fold cross-validation. On the tooling side, the project uses sklearn, Flask, HTML, CSS, and JavaScript, among others.
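The model-building step could be sketched roughly as follows. This is not the project's actual code: the data is synthetic, the two features (`sqft`, `bedrooms`) and the price formula are assumptions made for illustration, and `pickle` is used here as one possible way to save the model.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Synthetic stand-in for the housing data (assumption, not the Kaggle dataset)
rng = np.random.default_rng(0)
n_rows = 200
sqft = rng.uniform(500, 3000, n_rows)
bedrooms = rng.integers(1, 6, n_rows)
price = 50 + 0.05 * sqft + 10 * bedrooms + rng.normal(0, 5, n_rows)

X = np.column_stack([sqft, bedrooms])
y = price

# k-fold cross-validation: one R^2 score per fold
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=cv)

# GridSearchCV over a small hyperparameter grid for linear regression
search = GridSearchCV(
    LinearRegression(), {"fit_intercept": [True, False]}, cv=cv
)
search.fit(X, y)
model = search.best_estimator_

# Save and restore the fitted model, as a server process would do
saved_bytes = pickle.dumps(model)
restored = pickle.loads(saved_bytes)

# Predict the price for a hypothetical 1200 sq ft, 3-bedroom house
predicted = restored.predict(np.array([[1200, 3]]))[0]
```

A Flask server would typically load the saved model once at startup and call `predict` inside a request handler.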


📝 Changelog

  • [2023.08.09]: Update Matplotlib Notes.
  • [2023.08.30]: Update NumPy and Pandas Notes.
  • [2023.08.30]: Update Bengaluru House Price Project.
  • [2023.04.18]: Finalized the project details.

📭 Contact

If you have any comments or questions, feel free to contact kwokhinchi@gmail.com.


📖 Acknowledgements

Python Libraries for Data Analytics is an open source project, contributed by our team. We thank all contributors who implemented their methods or added new features, as well as users who provided valuable feedback. We hope for further implementation and improvement of these systems.

Permission is hereby granted, free of charge, to any person obtaining a copy of this Repository and associated documentation files, to deal in the Repository without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Repository, and to permit persons to whom the Repository is furnished to do so.

The authors or copyright holders shall not be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the Repository or the use or other dealings in the Repository.

📢 Disclaimer

This repository is for personal/research/non-commercial use only.


Copyright © Haley Kwok. All rights reserved.
Credit: Materials learned from @Codebasics
