Basic Machine Learning Introductory Documents


This project implements classic machine learning (ML) algorithms. Motivations for the project include:

  • Helping machine learning newcomers gain a better and deeper understanding of the basic algorithms and models in this field.
  • Providing real-life, commercially used methods in the ML field.
  • Keeping my mathematical theory and coding skills fresh through these cases.



1.FM

1.1 fastfm

Shows how to use the fastFM package to classify the training data directly.

1.2 Fsfm


We rewrote FM ourselves, focusing on helping people gain deeper insight into FM. We uploaded it to PyPI under the name Fsfm; you can download it if you're interested.
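The model behind both implementations is the factorization machine: a bias, linear terms, and factorized pairwise interactions. A minimal sketch of the scoring function (illustrative only, not the actual Fsfm API), using the standard identity that computes the pairwise term in O(k·n) instead of O(n²·k):

```python
def fm_score(x, w0, w, V):
    """FM prediction: y(x) = w0 + sum_i w_i*x_i + sum_{i<j} <v_i,v_j>*x_i*x_j.

    x: feature vector, w0: bias, w: linear weights, V: n x k factor matrix.
    The pairwise term uses
      sum_{i<j} <v_i,v_j> x_i x_j
        = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ].
    """
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    k = len(V[0])
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s2 = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s2)
    return linear + pairwise
```

The reformulated pairwise term is what makes FM training linear in the number of features, which is the main point the rewritten code tries to make visible.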


2.n-gram

An interview problem in NLP solved with n-grams instead of Naive Bayes.
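For reference, extracting n-grams from a token sequence takes only a few lines of Python (a generic sketch, not the repository's exact code):

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the cat sat on the mat".split()
bigrams = ngrams(tokens, 2)     # 5 overlapping bigrams
counts = Counter(bigrams)       # frequency table, the basis of an n-gram model
```

Relative frequencies of these counts give the conditional probabilities an n-gram language model is built from.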



3.svd

3.1 Matrix decomposition with linalg

3.2 Matrix decomposition with RSVD
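A sketch of both variants, assuming numpy is available: plain truncated SVD via numpy.linalg, and a basic randomized SVD (RSVD) that projects onto a random low-dimensional subspace, orthonormalizes it with QR, and runs the SVD on the small projected matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
k = 5

# Plain truncated SVD: keep the k largest singular triplets.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] * s[:k] @ Vt[:k]          # best rank-k approximation (Eckart-Young)

# Randomized SVD: random projection + QR range finding + small SVD.
Omega = rng.standard_normal((A.shape[1], k + 5))  # oversample a little
Q, _ = np.linalg.qr(A @ Omega)                    # orthonormal basis of the range
Uh, sh, Vth = np.linalg.svd(Q.T @ A, full_matrices=False)
A_rk = (Q @ Uh[:, :k]) * sh[:k] @ Vth[:k]         # approximate rank-k reconstruction
```

The RSVD only ever decomposes the small (k+5)×20 matrix, which is what makes it attractive for large data; its error can never beat the exact truncated SVD, but it is usually close.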

4.Collaborative Filtering Recommendation System


4.1 Item-based

4.2 User-based
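Both variants rest on a similarity measure between rating vectors: item-based CF compares columns of the user-item matrix, user-based CF compares rows. A minimal cosine-similarity sketch (illustrative, not the repository's exact code):

```python
import math

def cosine(u, v):
    """Cosine similarity between two rating vectors (0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# rows = users, columns = items (0 = unrated)
ratings = [
    [5, 3, 0],
    [4, 0, 4],
    [1, 1, 5],
]
item_vecs = list(zip(*ratings))          # transpose: one rating vector per item
sim_01 = cosine(item_vecs[0], item_vecs[1])
```

With item similarities in hand, a user's predicted rating for an unseen item is a similarity-weighted average over the items they already rated.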

5.Semantic recognition


5.1 Jieba Process

5.2 TF-IDF

5.3 BP Neural Network

5.4 SVM Process

5.5 Naive Bayes

5.6 Random Forest
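As an illustration of step 5.2, a tiny pure-Python TF-IDF (the textbook variant with idf = log(N/df); production pipelines usually smooth the idf to avoid division issues):

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists -> list of {term: tf-idf weight} dicts.

    tf  = term count / document length
    idf = log(N / document frequency)
    """
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))   # docs containing each term
    out = []
    for doc in docs:
        tf = Counter(doc)
        out.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return out

docs = [["cat", "sat"], ["cat", "ran"], ["dog", "ran"]]
weights = tf_idf(docs)
```

These weight vectors are exactly the features the downstream classifiers (BP neural network, SVM, Naive Bayes, random forest) are trained on after jieba segmentation.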




7.1 Mean of the weights

7.2 Random scaling in the connected vector



8.Frcwp

Frcwp stands for fast risk control with Python. It's a lightweight tool that automatically recognizes outliers in a large data pool.
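Frcwp's own API is not shown here; as an illustration of the underlying idea of automatic outlier recognition, a simple Tukey-fence filter over a single metric:

```python
import statistics

def flag_outliers(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule).

    A purely statistical, parameter-light rule: no labels needed, which is
    typical of lightweight risk-control screening over a large data pool.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]
```

Real risk-control tools combine many such per-feature signals and score entities by how many fences they break; this sketch only shows the single-feature primitive.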




9.Ensemble

9.1 Data preprocessing before ensembling

9.2 Case study: stacking XGBoost and logistic regression

9.3 Case study: stacking GBDT and logistic regression

9.4 Case study: bagging XGBoost or GBDT models

9.5 How to use the trained stacking model in the online serving module
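The key step behind 9.1 and 9.5 is building out-of-fold predictions, so the second-level model never sees base-model predictions made on the base models' own training rows. A generic sketch with a toy base model (the fit/predict interface is an assumption, in the style of scikit-learn estimators):

```python
class MeanModel:
    """Toy base model: predicts the mean of its training targets."""
    def fit(self, X, y):
        self.mean = sum(y) / len(y)
    def predict(self, X):
        return [self.mean] * len(X)

def oof_predictions(model_factory, X, y, n_folds=3):
    """Out-of-fold predictions: train on K-1 folds, predict the held-out fold.

    Each sample's prediction comes from a model that never saw its label,
    which is what prevents label leakage into the stacking meta-features.
    """
    n = len(X)
    preds = [None] * n
    folds = [list(range(i, n, n_folds)) for i in range(n_folds)]
    for hold in folds:
        train = [i for i in range(n) if i not in hold]
        model = model_factory()
        model.fit([X[i] for i in train], [y[i] for i in train])
        for i in hold:
            preds[i] = model.predict([X[i]])[0]
    return preds   # one meta-feature column for the second-level model
```

Online (9.5), the base models are instead retrained on all data once, and their live predictions are fed to the second-level model trained on these OOF columns.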


10.Tsnewp

T-distributed stochastic neighbor embedding (t-SNE), rewritten in Python by ourselves; it's a good dimensionality-reduction method. Many explanations are added throughout the code.

Package download address.

More test data.
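The core quantity t-SNE computes in the input space is the Gaussian conditional probability p(j|i) of picking point j as a neighbor of point i. A sketch with a fixed bandwidth (real t-SNE binary-searches sigma per point to hit a target perplexity):

```python
import math

def conditional_p(points, i, sigma=1.0):
    """p_{j|i}: Gaussian neighbor probabilities seen from point i.

    Fixed sigma for simplicity; t-SNE proper tunes sigma_i per point so the
    distribution's perplexity matches a user-chosen value.
    """
    d2 = [sum((a - b) ** 2 for a, b in zip(points[i], p)) for p in points]
    w = [math.exp(-d2[j] / (2 * sigma ** 2)) if j != i else 0.0
         for j in range(len(points))]
    z = sum(w)
    return [wj / z for wj in w]
```

The embedding is then found by matching these probabilities against Student-t similarities in the low-dimensional space via gradient descent on the KL divergence.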

11.Knowledge Summary

Some questions for newcomers to estimate their level in ML and DL. It also contains the key points I noted during my study of Andrew Ng's machine learning lessons (to be continued).

I also wrote some words for newcomers. Read 写给想转行机器学习深度学习的同学 if you're interested.


12.Youtube

Following the paper 'Deep Neural Networks for YouTube Recommendations', implemented in Python.


@blog: Some thoughts on, and an implementation of, 'Deep Neural Networks for YouTube Recommendations'
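In the paper, candidate generation reduces serving to a nearest-neighbor search: items are ranked by the dot product of the learned user embedding with each item embedding (the softmax weights). A toy sketch of that scoring step (illustrative only):

```python
def top_candidates(user_vec, item_vecs, k=2):
    """Rank items by dot product with the user embedding, return top-k indices.

    Mirrors serving-time candidate generation: the trained softmax layer is
    discarded and replaced by a nearest-neighbor lookup over item embeddings.
    """
    scores = [sum(u * v for u, v in zip(user_vec, item)) for item in item_vecs]
    return sorted(range(len(item_vecs)), key=lambda i: -scores[i])[:k]
```

At YouTube's scale this lookup is done with an approximate nearest-neighbor index rather than the exhaustive scan shown here.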


See More From:


More you may be interested in: the FM part || the deepFM part


See More From:


More you may be interested in: YouTube skn Vector construction || N-Grams


Python environment. More details can be found in each individual project's requirements.


If you find any incorrect content, I'm sorry about that. Please contact me in the following way: