Machine_Learning

This repository contains implementations of several widely used machine learning and data-analysis techniques.

When new files are added, a corresponding description is added here with the file name and the date (DDMMYY; later entries use DDMMYYYY).

These programs started from machine learning code influenced by Muller's Machine Learning with Python book. Many additional techniques were implemented later.

190818: PCA_Muller.py, principal component analysis example with the breast cancer data set. A detailed description of this code is available on Towards Data Science.
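
As a rough illustration of the idea (not the contents of PCA_Muller.py), the sketch below scales the breast cancer data set bundled with scikit-learn and projects it onto two principal components. Choosing two components and standardizing first are assumptions made for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load the 30-feature breast cancer data set shipped with scikit-learn.
cancer = load_breast_cancer()

# PCA is sensitive to feature scale, so standardize each feature first.
X_scaled = StandardScaler().fit_transform(cancer.data)

# Keep the two directions of largest variance (an assumption for this sketch).
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print("original shape:", X_scaled.shape)
print("reduced shape:", X_pca.shape)
print("explained variance ratio:", pca.explained_variance_ratio_)
```

The `explained_variance_ratio_` attribute reports how much of the total variance each retained component captures.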

270918: RidgeandLin.py, LassoandLin.py, Ridge and Lasso regression examples. The code shows how Ridge shrinks coefficients and how Lasso performs feature selection. The concepts and a discussion of the results are described here.
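
The contrast between shrinkage and feature selection can be sketched as follows. This is a minimal example on synthetic data, not the code in RidgeandLin.py or LassoandLin.py; the data set from `make_regression` and the `alpha` values are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic data (an assumption): 20 features, only 5 of which are informative.
X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

lin = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)   # L2 penalty: shrinks coefficients
lasso = Lasso(alpha=5.0).fit(X, y)    # L1 penalty: can set coefficients to zero

lin_norm = np.linalg.norm(lin.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)
n_zero_lasso = int(np.sum(lasso.coef_ == 0))

print("L2 norm of coefficients, linear:", lin_norm)
print("L2 norm of coefficients, ridge:", ridge_norm)
print("coefficients set to zero by lasso:", n_zero_lasso, "of", X.shape[1])
```

Larger `alpha` values strengthen the penalty in both models; in Lasso this typically removes more features.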

081018: bank.csv, data set on a Portuguese company selling products to random customers by phone. A detailed description is available here.

161018: gender_purchase.csv, data-set of two columns describing customers buying a product depending on gender.

111118: winequality-red.csv, red wine data set, where the output is the quality column which ranges from 0 to 10.

121118: pipelineWine.py, a simple example of using Pipeline and GridSearchCV together on the red wine data. More description can be found here.
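
A minimal sketch of the Pipeline plus GridSearchCV pattern is shown below. It is not the code in pipelineWine.py: to stay self-contained it uses scikit-learn's bundled `load_wine` data instead of winequality-red.csv, and the k-nearest-neighbours estimator and parameter grid are assumptions.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data set (an assumption): the wine data bundled with scikit-learn.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Putting the scaler inside the pipeline means it is refit on each
# cross-validation training fold, so no information leaks from the held-out fold.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_grid = {"knn__n_neighbors": [3, 5, 7, 9]}

grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X_train, y_train)

test_score = grid.score(X_test, y_test)
print("best parameters:", grid.best_params_)
print("test accuracy:", test_score)
```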

24112018: lagmult.py, demonstrates a simple constrained optimization problem with figures, using the Lagrange multiplier method.
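
For readers new to the method, here is a small numerical sketch. The problem (maximize x + y on the unit circle) is a hypothetical example and is not necessarily the one used in lagmult.py. It finds a stationary point of the Lagrangian L(x, y, λ) = x + y − λ(x² + y² − 1) with Newton's method.

```python
import numpy as np

def lagrangian_gradient(v):
    """Gradient of L(x, y, lam) = x + y - lam * (x**2 + y**2 - 1)."""
    x, y, lam = v
    return np.array([
        1 - 2 * lam * x,          # dL/dx
        1 - 2 * lam * y,          # dL/dy
        -(x**2 + y**2 - 1),       # dL/dlam, zero exactly when the constraint holds
    ])

def lagrangian_hessian(v):
    """Matrix of second derivatives of L with respect to (x, y, lam)."""
    x, y, lam = v
    return np.array([
        [-2 * lam, 0.0, -2 * x],
        [0.0, -2 * lam, -2 * y],
        [-2 * x, -2 * y, 0.0],
    ])

# Newton iteration on grad L = 0, started from an arbitrary guess.
v = np.array([1.0, 1.0, 1.0])
for _ in range(20):
    v = v - np.linalg.solve(lagrangian_hessian(v), lagrangian_gradient(v))

x_opt, y_opt, lam_opt = v
print("x, y, lambda:", x_opt, y_opt, lam_opt)
```

Solving the stationarity conditions by hand gives x = y = λ = 1/√2, which the iteration approaches from this starting point.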

11122018: Consumer_Complaints_short.csv, 3 columns describing the complaints, product_label and category. Complete file can be obtained from Govt.data. File size is around 650 MB. More details about the usage of this file will be uploaded soon when the text classification program is ready.

13122018: Text-classification_Complain_Suvo.py, classifies the consumer complaints data described above. The file works with the complete data set (650 MB). Of the several ML algorithms tested, Linear SVM works best. With more computing resources, more rows can be passed to TfidfVectorizer.
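
The TF-IDF plus linear SVM combination can be sketched on a toy corpus. The six complaint texts and three labels below are invented for this example and do not come from the consumer complaints file; the real program and its preprocessing may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus (not taken from the real data set).
texts = [
    "I was charged a late fee on my credit card",
    "My credit card was billed twice",
    "The mortgage payment was applied to the wrong loan",
    "My mortgage escrow account is wrong",
    "A debt collector keeps calling me",
    "The debt collector threatened me",
]
labels = [
    "Credit card", "Credit card",
    "Mortgage", "Mortgage",
    "Debt collection", "Debt collection",
]

# TfidfVectorizer turns each text into a sparse weighted word-count vector;
# LinearSVC then fits one linear decision function per class.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)

new_complaints = ["unknown charge on my credit card statement"]
predictions = model.predict(new_complaints)
print(predictions)
```

On a large file, the `max_features` parameter of `TfidfVectorizer` is one way to cap memory use.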

1912018: SVMdemo.py, shows the effect of using an RBF kernel to map data from 2D space to 3D space. The animation requires ffmpeg on Unix systems.
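
The intuition can be sketched without any plotting. The example below is an illustration, not the code in SVMdemo.py: it adds a single Gaussian feature of the distance from the origin as a third coordinate, which is a simplification of the RBF kernel's actual (infinite-dimensional) feature map. The `make_circles` data and `gamma = 1.0` are assumptions.

```python
import numpy as np
from sklearn.datasets import make_circles

# Two concentric rings: not separable by any straight line in 2D.
# Label 1 is the inner ring, label 0 the outer ring.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Third coordinate: a Gaussian (RBF-like) bump centred at the origin.
gamma = 1.0  # assumed value for this sketch
z = np.exp(-gamma * np.sum(X**2, axis=1))
X_3d = np.column_stack([X, z])

inner_min = z[y == 1].min()
outer_max = z[y == 0].max()
print("lowest z on inner ring:", inner_min)
print("highest z on outer ring:", outer_max)
```

When the lowest inner-ring value exceeds the highest outer-ring value, a horizontal plane between the two separates the classes in 3D.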

05032019: IBM_Python_Web_Scrapping.ipynb, covers basic web scraping, string handling, and image manipulation while generating a fake cover for our band.

06042019: datacleaning, Folder containing files and images related to data cleaning with pandas. For more details check Medium.

09062019: DBSCAN_Complete, folder containing files and images related to applying the DBSCAN algorithm to cluster weather stations in Canada. Apart from the usual scikit-learn, numpy, and pandas, I have used Basemap to show the clusters on a map. More details can be found on Medium.
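
A minimal DBSCAN sketch on synthetic points is given below. It does not use the weather station data or Basemap; the blob centres, `eps`, and `min_samples` are assumptions that follow the style of the scikit-learn DBSCAN demo.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for station coordinates (an assumption).
centers = [[1, 1], [-1, -1], [1, -1]]
X, _ = make_blobs(n_samples=750, centers=centers, cluster_std=0.4,
                  random_state=0)
X = StandardScaler().fit_transform(X)

# eps: neighbourhood radius; min_samples: points needed to form a dense core.
db = DBSCAN(eps=0.3, min_samples=10).fit(X)
cluster_labels = db.labels_

# DBSCAN marks noise points with the label -1.
n_clusters = len(set(cluster_labels)) - (1 if -1 in cluster_labels else 0)
n_noise = int(np.sum(cluster_labels == -1))
print("estimated clusters:", n_clusters)
print("noise points:", n_noise)
```

Unlike k-means, the number of clusters is not specified in advance; it follows from the density parameters.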

13072019: SVM_Decision_Boundary, I set up a pipeline with StandardScaler, PCA, and SVM, and performed grid-search cross-validation to find the best-fit parameters. Using these, the decision function contours of the SVM classifier for binary classification are plotted. Read more on TDS.
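
The workflow can be sketched as below. This is not the code in the SVM_Decision_Boundary folder: the breast cancer data set, the two PCA components, and the parameter grid are assumptions, and the plotting step is left out so that the example only computes the values a contour plot would need.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Binary classification data (an assumed stand-in).
X, y = load_breast_cancer(return_X_y=True)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("pca", PCA(n_components=2)),
    ("svc", SVC(kernel="rbf")),
])
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": [0.01, 0.1, 1]}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)

# Project the data into the 2D PCA space used by the best pipeline.
best = grid.best_estimator_
X_2d = best.named_steps["pca"].transform(
    best.named_steps["scaler"].transform(X))

# Evaluate the SVM decision function on a mesh covering that 2D space.
xx, yy = np.meshgrid(
    np.linspace(X_2d[:, 0].min() - 1, X_2d[:, 0].max() + 1, 50),
    np.linspace(X_2d[:, 1].min() - 1, X_2d[:, 1].max() + 1, 50),
)
mesh_points = np.column_stack([xx.ravel(), yy.ravel()])
Z = best.named_steps["svc"].decision_function(mesh_points).reshape(xx.shape)

print("best parameters:", grid.best_params_)
print("cross-validated accuracy:", grid.best_score_)
```

Passing `xx`, `yy`, and `Z` to a contour routine such as matplotlib's `contourf` would draw the decision function, with the zero level marking the decision boundary.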

