Data science tools and stat learning

This repository contains projects showcasing my skills in machine learning, data science, and statistical learning techniques. The projects include feature extraction from pathology images and regression tree optimization, among others. Below is a summary of each project and the tools used.

Project 1: Feature Extraction and Clustering

Description: This project uses deep learning to extract features from pathology images. As part of an assignment, a deep network generates features from a set of training images, and 10% of the data is set aside for validation.
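The extraction and hold-out steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the random tensors stand in for real pathology images, and the small CNN named `extractor` is a hypothetical placeholder, not the network used in the assignment.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for the real data: 20 random RGB "images" of size 64x64.
images = torch.rand(20, 3, 64, 64)

# Hypothetical small CNN used as a feature extractor (not the assignment's network).
extractor = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),  # global average pooling -> (N, 8, 1, 1)
    nn.Flatten(),             # -> (N, 8)
)
extractor.eval()

# Run the network without tracking gradients; the outputs are the features.
with torch.no_grad():
    features = extractor(images)

# Shuffle the indices and hold out 10% of the feature vectors for validation.
n_val = max(1, int(0.1 * len(features)))
perm = torch.randperm(len(features))
val_features = features[perm[:n_val]]
train_features = features[perm[n_val:]]

print(tuple(train_features.shape), tuple(val_features.shape))
```

In practice a pretrained backbone with its classification head removed would likely replace the toy network, but the overall flow (forward pass, flatten, split) stays the same.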

Tools and Technologies:

Deep Learning Framework (PyTorch)
Image Processing Libraries (OpenCV, PIL)
NumPy, Pandas for data handling
Clustering Techniques (K-Means, Hierarchical Clustering)
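
As a rough illustration of the clustering techniques listed above, the sketch below applies K-Means and hierarchical (agglomerative) clustering from scikit-learn to a synthetic feature matrix. The two Gaussian blobs and the choice of two clusters are assumptions made for the example and do not come from the notebooks.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

rng = np.random.default_rng(0)

# Hypothetical feature matrix: two well-separated blobs of 8-dimensional vectors.
blob_a = rng.normal(loc=0.0, scale=0.5, size=(30, 8))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(30, 8))
features = np.vstack([blob_a, blob_b])

# K-Means with a fixed seed so the run is repeatable.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# Hierarchical clustering with Ward linkage, cut at two clusters.
hier_labels = AgglomerativeClustering(n_clusters=2, linkage="ward").fit_predict(features)

print("K-Means cluster sizes:", np.bincount(kmeans_labels))
print("Hierarchical cluster sizes:", np.bincount(hier_labels))
```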

Key Files:

Assignment1.ipynb: Jupyter Notebook containing the code for feature extraction from pathology images.
code.ipynb: Supplementary code for handling specific tasks related to feature generation.

Project 2: Regression Tree Optimization and Classification

Description: This project centers on a regression task, with an emphasis on decision tree optimization. The tasks include data preprocessing, tuning regression trees to prevent overfitting, comparing the tuned trees with Random Forest and Support Vector Regression (SVR), and deriving a classification task by applying a threshold to the label column.
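
One possible way to tune a regression tree and compare it with Random Forest and SVR is sketched below. The synthetic dataset, the parameter grid, and the SVR settings are all illustrative assumptions; the actual data and search ranges used in the notebook are not shown here.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption: the real dataset is not part of this sketch).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Limiting depth and leaf size curbs overfitting; the grid values are illustrative.
param_grid = {"max_depth": [2, 4, 6, None], "min_samples_leaf": [1, 5, 10]}
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X_train, y_train)

# The tuned tree is refit on the full training split by GridSearchCV.
models = {
    "tuned_tree": search.best_estimator_,
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train),
    "svr": SVR(kernel="rbf", C=10.0).fit(X_train, y_train),
}


def rmse(model):
    """Root mean squared error of a fitted model on the held-out test split."""
    return float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))


results = {name: rmse(model) for name, model in models.items()}
print("Best tree parameters:", search.best_params_)
for name, score in results.items():
    print(f"{name}: RMSE = {score:.2f}")
```

Cross-validated grid search is used here as one common way to pick tree hyperparameters; cost-complexity pruning via `ccp_alpha` would be another reasonable option.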

Tools and Technologies:

Scikit-learn for machine learning models (Decision Trees, Random Forest, SVR)
Data Preprocessing Libraries (Pandas, NumPy)
Metrics for Model Evaluation (RMSE, Accuracy, Precision, Recall)
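
To make the metrics listed above concrete, here is a small dependency-free sketch that computes RMSE on continuous values and then accuracy, precision, and recall after thresholding the labels. The toy values, the threshold of 6.0, and the convention that values at or above the threshold count as the positive class are assumptions for illustration only.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def binarize(values, threshold):
    """Map continuous values to 0/1 labels (assumed rule: value >= threshold -> 1)."""
    return [1 if v >= threshold else 0 for v in values]


def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy regression targets and predictions (made-up numbers).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [4.0, 5.0, 5.0, 10.0]
threshold = 6.0

print("RMSE:", round(rmse(y_true, y_pred), 4))
labels_true = binarize(y_true, threshold)
labels_pred = binarize(y_pred, threshold)
print("accuracy, precision, recall:", classification_metrics(labels_true, labels_pred))
```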

Key Files:

A2.ipynb: Jupyter Notebook containing code for data preprocessing, regression tree optimization, comparison with Random Forest and SVR, and classification tasks.
Problem_Statement.txt: Detailed description of the assignment requirements and tasks.

How to Use

Clone the repository to your local machine:

```
git clone https://github.com/yourusername/your-repo-name.git
```

Navigate to the project directory:
```
cd your-repo-name
```
Open the Jupyter Notebooks in your preferred environment (e.g., JupyterLab, Google Colab).
Follow the instructions within each notebook to reproduce the results.

Additional Notes

Make sure to have the required Python libraries installed. You can install the necessary dependencies by running:
```
pip install -r requirements.txt
```
