
100daysofML

I am using an open-source platform to showcase my work and motivate others to work along with me.

Day1: 27th Nov

Mobile Price Classification

Day2: 28th Nov

XGBoost (Link1, Link2): how to use XGBoost with Python.
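
A minimal sketch of the scikit-learn-style XGBoost API (the dataset, split, and hyperparameters here are illustrative, not taken from the linked articles):

```python
# Minimal XGBoost classification sketch (illustrative data and parameters).
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

model = xgb.XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))
```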

Day3: 30th Nov

PyTorch Udacity introduction, Lesson 2 (videos 1 to 25 complete)

Day4: 1st Dec

Studied the accuracy paradox: on imbalanced data, a model that always predicts the majority class can score high accuracy while being useless for the minority class.
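
A small sketch of the paradox with made-up imbalanced labels: a "model" that always predicts the majority class looks accurate but catches no positives.

```python
# Accuracy paradox sketch: high accuracy, zero recall (labels are made up).
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

y_true = np.array([0] * 990 + [1] * 10)   # 99% negatives (illustrative)
y_pred = np.zeros_like(y_true)            # always predict the majority class

print(accuracy_score(y_true, y_pred))     # 0.99 -- looks great
print(recall_score(y_true, y_pred))       # 0.0  -- misses every positive
```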

Day5: 2nd Dec

Studied classification algorithms: KNN, SVC, K-medoids.

Day6: 4th Dec

Studied the Apriori algorithm and association rules. How is it different from collaborative filtering? Worked through a market-basket-analysis example.

Apriori Introduction | Apriori vs Collaborative Filtering
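
A minimal support-counting sketch of the Apriori idea (the transactions and min_support threshold are made up; a full implementation would iterate to larger itemsets and prune):

```python
# Apriori-style support counting on a toy market-basket dataset.
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
min_support = 0.6  # fraction of transactions

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

# Frequent 1-itemsets, then candidate 2-itemsets built from them (Apriori
# property: every subset of a frequent itemset must itself be frequent).
items = {i for t in transactions for i in t}
f1 = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
f2 = [a | b for a, b in combinations(f1, 2) if support(a | b) >= min_support]
print(f1, f2)
```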

Day7: 5th Dec

Studied spectral clustering.
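
A minimal sketch using scikit-learn's SpectralClustering on a toy two-moons dataset, where plain K-Means tends to fail (parameters are illustrative):

```python
# Spectral clustering on a non-convex toy dataset (two moons).
from sklearn.datasets import make_moons
from sklearn.cluster import SpectralClustering

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
labels = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                            random_state=0).fit_predict(X)
print(labels[:10])
```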

Day8: 6th Dec

Studied natural language processing: CFG, CNF, CYK. The CYK algorithm tells whether a given sentence can be generated by a given context-free grammar.

Conversion to Chomsky Normal Form (CNF) transforms a CFG into an equivalent grammar in which every production has the form A -> BC or A -> a (with S -> epsilon allowed only if the language contains the empty string).
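
A CYK recognizer sketch for a toy grammar that is already in CNF (the grammar itself is invented for illustration):

```python
# CYK recognizer for a toy CNF grammar:
#   S -> AB | BB,  A -> 'a' | AB,  B -> 'b'
binary = {("A", "B"): {"S", "A"}, ("B", "B"): {"S"}}  # A -> BC rules
terminal = {"a": {"A"}, "b": {"B"}}                   # A -> a rules

def cyk(words):
    n = len(words)
    # table[l][i]: nonterminals deriving the l-word span starting at i
    table = [[set() for _ in range(n)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        table[1][i] = set(terminal.get(w, set()))
    for l in range(2, n + 1):            # span length
        for i in range(n - l + 1):       # span start
            for k in range(1, l):        # split point
                for b in table[k][i]:
                    for c in table[l - k][i + k]:
                        table[l][i] |= binary.get((b, c), set())
    return "S" in table[n][0]

print(cyk(list("abb")))  # True:  S => AB => (AB)B => abb
print(cyk(list("ba")))   # False
```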

Day9: 10th Dec

Clustering algorithms: K-Means, hierarchical clustering.
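
A minimal sketch of both algorithms with scikit-learn on toy blobs (parameters are illustrative):

```python
# K-Means and agglomerative (hierarchical) clustering on toy data.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, AgglomerativeClustering

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
hc_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)
print(km_labels[:10], hc_labels[:10])
```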

Day10: 11th Dec

Google Machine Learning Crash Course

Day11: 12th Dec

Worked on a neural style transfer project, watched PyTorch Udacity (Lecture 2), and worked on the Ben10 dataset.

Day12: 13th Dec

PyTorch Udacity Lecture 2, continued

Day13: 15th Dec

PyTorch Udacity Lecture 2, continued

Day14: 16th Dec

Evaluation metrics for classification (computed in the sketch after this list):

  • Jaccard Index: JI = |Intersection| / |Union|

    • JI close to 1 means the predicted and true label sets are highly similar
    • JI close to 0 means they barely overlap
  • F1-Score = 2 * Precision * Recall / (Precision + Recall)

    • 1 is best and 0 is worst
  • Log Loss: the output for the class label is a probability instead of a hard category.

    • Log loss measures the performance of a classifier whose predicted output is a probability value between 0 and 1.
    • Log Loss = -( y * log(y_pred) + (1 - y) * log(1 - y_pred) )
    • Average Log Loss = -(1/n) * Σ ( y * log(y_pred) + (1 - y) * log(1 - y_pred) )
    • A lower log loss means a better model; a higher value means a poorer one.

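The three metrics above, computed with scikit-learn on made-up labels:

```python
# Evaluation metrics for classification on illustrative labels.
from sklearn.metrics import jaccard_score, f1_score, log_loss

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]                   # hard class labels
y_prob = [0.9, 0.1, 0.8, 0.4, 0.2, 0.7, 0.6, 0.3]   # predicted P(y = 1)

print(jaccard_score(y_true, y_pred))  # |intersection| / |union|
print(f1_score(y_true, y_pred))       # 2 * P * R / (P + R)
print(log_loss(y_true, y_prob))       # average negative log-likelihood
```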

Day15: 17th Dec

Worked on a car-dataset question; the snippets below are stitched into a runnable example after the list.

  • Shuffle the rows of a DataFrame
    • np.random.shuffle(df.values)
  • Concatenate two DataFrames
    • frames = [df1, df2]
    • result = pd.concat(frames, axis=1)
  • Rename columns in pandas
    • df.rename(columns={'A': 'a'}, inplace=True)
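
The snippets above as a self-contained example (column names and data are made up; note that shuffling via .values mutates the underlying array, and df.sample(frac=1) is a common alternative):

```python
# Runnable version of the pandas snippets above (made-up data).
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df2 = pd.DataFrame({"C": [7, 8, 9]})

# Shuffle rows in place (df1.sample(frac=1) is a safer alternative)
np.random.shuffle(df1.values)

# Concatenate side by side (axis=1); axis=0 would stack rows instead
result = pd.concat([df1, df2], axis=1)

# Rename a column
result.rename(columns={"A": "a"}, inplace=True)
print(result)
```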

Day16: 20th Dec

Day17: 23rd Dec

Udacity PyTorch Lecture 2 (neural networks) finished.

Day18: 24th Dec

Udacity Talk on PyTorch Lecture 3 finished.

Started PyTorch Lecture 4.

  • Single-layer neural network (a runnable version follows this list)
    • features = torch.rand((1, 5))  # creates a tensor of shape (1, 5)
    • Method 1: y = activation(torch.sum(features * weights) + bias)
    • Method 2: weights = weights.view(5, 1)  # reshapes the weight tensor
      • y = activation(torch.mm(features, weights) + bias)
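
A runnable version of the two methods above (the sigmoid activation and the random weights and bias are assumptions to make it self-contained):

```python
# Single-layer network: two equivalent ways to compute the output.
import torch

def activation(x):
    return 1 / (1 + torch.exp(-x))  # sigmoid (an assumption)

torch.manual_seed(7)
features = torch.rand((1, 5))   # one sample, five features
weights = torch.rand((1, 5))    # same shape as features
bias = torch.rand((1, 1))

# Method 1: elementwise multiply, then sum
y1 = activation(torch.sum(features * weights) + bias)

# Method 2: reshape weights and use matrix multiplication
y2 = activation(torch.mm(features, weights.view(5, 1)) + bias)
print(y1, y2)  # identical results
```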

Day19: 27th Dec

Lecture 4 Started

Day20: 29th Dec

Lecture 4 continued

Day21: 1st Jan

Lecture 4 almost finished. Learned how to save the weights of a trained model (SaveModel).
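
A minimal save/load sketch (the model architecture and the 'checkpoint.pth' filename are placeholders):

```python
# Saving and restoring model weights via the state dict.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

torch.save(model.state_dict(), "checkpoint.pth")  # save weights only

state_dict = torch.load("checkpoint.pth")
model.load_state_dict(state_dict)                 # restore into same shape
```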

Day22: 2nd Jan

Finished Lecture 4 and started Lecture 5 (CNNs); watched videos up to Video 14.

Day23/24/25: 4th Jan - 6th Jan

Started a PyTorch project on Google Colab.

Day26: 10th Jan

R programming: decision trees, PCA, Naive Bayes, linear regression

Day27: 20th Jan

Article on feature selection

Difference between covariance and correlation: covariance measures how two variables vary together in their original units, while correlation is covariance normalized by the standard deviations, so it is unitless and bounded between -1 and 1.

Feature Selection [ VERY IMPORTANT TOPIC ]:

Why is feature selection Important?

  • Training time increases exponentially with the number of features.
  • Models have an increasing risk of overfitting as the number of features grows.

Feature selection Techniques

  1. Filter methods
  2. Wrapper methods
  3. Embedded methods

Filter methods

Filter methods consider the relationship between each feature and the target variable to compute the importance of features.

Wrapper methods

Wrapper methods train models on subsets of features and gauge their performance.

Embedded Methods

Embedded methods perform feature selection using insights provided by the machine learning model itself during training (for example, regularization penalties or tree-based feature importances).
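
One short scikit-learn example per family (the dataset, k=5, and the particular estimators are illustrative choices):

```python
# Filter, wrapper, and embedded feature selection (illustrative setup).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif, RFE
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

# Filter: score each feature against the target, keep the top k
X_filter = SelectKBest(f_classif, k=5).fit_transform(X, y)

# Wrapper: recursively fit a model and drop the weakest features
X_wrapper = RFE(LogisticRegression(max_iter=1000),
                n_features_to_select=5).fit_transform(X, y)

# Embedded: use importances learned by the model itself
forest = RandomForestClassifier(random_state=0).fit(X, y)
print(forest.feature_importances_.round(3))
```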

Day28: 24th Jan

Loss functions: Loss Functions

How to determine the value of K in clustering problems?

What causes overfitting, and how can it be prevented?

What is SVD (singular value decomposition)?
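
Sketches for two of the questions above: the elbow method for picking K, and SVD with NumPy (the data is random, for illustration):

```python
# Elbow method for K, and SVD reconstruction (illustrative data).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Elbow method: inertia vs. K; look for the "bend" in the curve
for k in range(1, 8):
    inertia = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    print(k, round(inertia, 1))

# SVD: factor A into U @ diag(S) @ Vt
A = np.random.rand(4, 3)
U, S, Vt = np.linalg.svd(A, full_matrices=False)
print(np.allclose(A, U @ np.diag(S) @ Vt))  # True
```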

Day30: 17th Sept 2020

  • Learned about ranking problems with MCDA/MCDM (multi-criteria decision analysis / decision making).
  • Understood how MCDM works via a research paper LINK
  • Studied WSM, WPM, AHP, ELECTRE, TOPSIS, and MOORA (see the TOPSIS sketch after this list)
  • Basics of MCDA: YouTube video
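
A NumPy sketch of TOPSIS, one of the techniques listed above (the decision matrix, weights, and benefit/cost flags are made up):

```python
# TOPSIS: rank alternatives by closeness to the ideal solution.
import numpy as np

M = np.array([[250, 16, 12],
              [200, 16,  8],
              [300, 32, 16]], float)      # alternatives x criteria
w = np.array([0.4, 0.3, 0.3])             # criterion weights
benefit = np.array([False, True, True])   # False = cost (lower is better)

V = M / np.linalg.norm(M, axis=0) * w     # normalize columns, then weight
best = np.where(benefit, V.max(axis=0), V.min(axis=0))
worst = np.where(benefit, V.min(axis=0), V.max(axis=0))

d_best = np.linalg.norm(V - best, axis=1)
d_worst = np.linalg.norm(V - worst, axis=1)
closeness = d_worst / (d_best + d_worst)  # higher = better alternative
print(closeness.argsort()[::-1])          # ranking, best first
```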

Day31: 18th Sept 2020

  • Studied a comparative analysis of MCDA techniques: LINK1 | LINK2
