This repository contains a collection of Python scripts covering fundamental data science, probability, and machine learning concepts. The codebase has been refactored for clarity and to follow PEP 8 formatting conventions.
The repository is organized by topics and modules, encompassing:
- Core Python Concepts: Functions, classes, and object-oriented programming.
- Data Manipulation: Extensive use of `pandas` and `numpy` for vectorized data formatting and mathematics.
- Data Visualization: Creating actionable charts and plots.
- Statistical Modeling: Advanced multivariate analyses utilizing `scikit-learn` and `statsmodels`.
- System Performance: Benchmarking and iterative brute force efficiency scripts.
To run these scripts locally, ensure you have Python 3 installed. It is recommended to use a virtual environment.
- Clone the repository:

      git clone https://github.com/boakyejeff/first_python_stats.git
      cd first_python_stats

- Create and activate a virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

- Install dependencies:

      pip install -r requirements.txt
Dependencies are tracked in `requirements.txt`. Key libraries include:
- `pandas`
- `numpy`
- `scikit-learn`
- `scipy`
- `statsmodels`
- `matplotlib` / `seaborn`
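
After installing, one way to confirm that the key libraries are visible to the active environment is a short check like the one below. This is a minimal sketch, not part of the repository: the helper name `check_dependencies` is hypothetical, and the module list assumes the usual import names (for example, `scikit-learn` is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the key libraries listed above.
# Assumption: scikit-learn is checked under its import name, `sklearn`.
KEY_MODULES = [
    "pandas",
    "numpy",
    "sklearn",
    "scipy",
    "statsmodels",
    "matplotlib",
    "seaborn",
]


def check_dependencies(modules=KEY_MODULES):
    """Return a dict mapping each module name to True if it can be located."""
    # find_spec returns None when a top-level module is not installed,
    # so nothing is actually imported during the check.
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, available in check_dependencies().items():
        print(f"{name}: {'ok' if available else 'MISSING'}")
```

Running this inside the activated virtual environment prints one line per library. Anything reported as `MISSING` suggests that `pip install -r requirements.txt` did not complete in that environment.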
MIT License.