Some fundamental machine learning and data-analysis techniques are explained through realistic examples.

khuyentran1401/Machine_Learning

 
 

Machine_Learning

This repo contains introductions to some of the most important machine learning and data-analysis techniques.

Filenames are preceded by the date in DDMMYY format. For descriptions and more, check the Wiki page.

190818: PCA_Muller.py, Principal component analysis example with the breast cancer data-set.

270918: RidgeandLin.py, LassoandLin.py: Lasso and Ridge regression examples.

081018: bank.csv, data-set on a Portuguese company selling products to random customers over phone call(s). Data-set description is available here.

161018: gender_purchase.csv, data-set of two columns describing customers buying a product depending on gender.

111118: winequality-red.csv, red wine data set, where the output is the quality column which ranges from 0 to 10.

121118: pipelineWine.py, A simple example of applying Pipeline and GridSearchCV together using the red wine data.

24112018: lagmult.py, This program demonstrates a simple constrained optimization problem using figures.

11122018: Consumer_Complaints_short.csv, 3 columns describing the complaints, product_label and category. Complete file can be obtained from Govt.data.

13122018: Text-classification_compain_suvo.py, Classifies the consumer complaints data described above.

1912018: SVMdemo.py*, This program shows the effect of using an RBF kernel to map from 2D space to 3D space. The animation requires ffmpeg on Unix systems.

05032019: IBM_Python_Web_Scrapping.ipynb, Deals with basic web scraping, string handling, and image manipulation.

06042019: datacleaning, Folder containing files and images related to data cleaning with pandas.

08062010: DBSCAN_Complete, Folder containing files and images related to application of DBSCAN algorithm to cluster Weather Stations in Canada.

13072019: SVM_Decision_Boundary, Pipeline and GridSearchCV are used to find the best-fit parameters for an SVM, and then the decision-function contours of the SVM classifier for binary classification are plotted.

28122019: DecsTree, Folder containing a notebook that uses a decision tree classifier on the Bank Marketing Data-Set.

07032020: Conjugate Prior, Folder containing a notebook where the concept of a conjugate prior is discussed, including an introduction to PyMC3.

29052020: ExMax_Algo, Folder contains a notebook completely explaining the Expectation Maximization algorithm.
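
As a rough illustration of the idea behind the SVMdemo.py entry above (lifting 2D points into 3D with a radial function so that they become separable), here is a minimal sketch. It is not the code from SVMdemo.py; the helper names `ring` and `lift`, the radii, and `gamma` are assumptions chosen for the example, and only the Python standard library is used.

```python
import math


def ring(radius, n_points):
    """Return n_points evenly spaced 2D points on a circle of the given radius."""
    return [
        (radius * math.cos(2 * math.pi * k / n_points),
         radius * math.sin(2 * math.pi * k / n_points))
        for k in range(n_points)
    ]


def lift(point, gamma=1.0):
    """Append a radial (RBF-style) third coordinate: z = exp(-gamma * (x^2 + y^2))."""
    x, y = point
    return (x, y, math.exp(-gamma * (x * x + y * y)))


# Two classes that no straight line can separate in 2D: an inner and an outer ring.
inner = ring(radius=0.5, n_points=12)
outer = ring(radius=2.0, n_points=12)

# Keep only the new third coordinate of each lifted point.
inner_z = [lift(p)[2] for p in inner]
outer_z = [lift(p)[2] for p in outer]

# After the lift, the horizontal plane z = threshold splits the two classes.
threshold = (min(inner_z) + max(outer_z)) / 2
separable = min(inner_z) > threshold > max(outer_z)
print(f"threshold z = {threshold:.3f}, separable: {separable}")
```

Points near the origin get a z value close to 1 and points far away get a z value close to 0, which is why a flat plane in 3D can do what no line can in 2D. An actual RBF-kernel SVM works with an implicit, higher-dimensional feature space, so this 3D picture is only a visual aid.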
