Gathers machine learning and data science techniques for problem solving.
Markov
NLP
R-vs-Python
Suggestion-Engine
big-query
english-text-normalization
image-processing refactor and add more notebooks Nov 10, 2018
network-study
preprocessing
signal-processing
stacking
stochastic-study
visualization
LICENSE
README.md
pickle_function.py

Machine-Learning-Data-Science-Reuse

Gathers machine learning and data science techniques for problem solving.

Warning

THIS REPOSITORY DELIBERATELY LACKS COMMENTS, DOCUMENTATION AND STORYTELLING. IT IS MEANT FOR SELF-REUSE.

Most visualizations are self-explanatory, but they require at least a basic understanding of statistics and Python.

Some visualizations will not display on GitHub because it cannot render certain SVG-based libraries, so please run the notebooks on your own machine to see the results.

Why Genie? Because he can solve anything!

Table of contents

R vs Python

  1. CSV, Data Manipulation, Visualization

Preprocessing

  1. Handle missing values
  2. Rescaling (log, vector normalization, standardization, min-max scaling, boxcox)
  3. Features understanding
  4. Detecting outliers
  5. Encoding type comparison
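
As a rough illustration of two of the rescaling techniques listed above (min-max scaling and standardization), here is a minimal standard-library sketch. The function names `min_max_scale` and `standardize` and the sample data are hypothetical and are not taken from the notebooks in this repository.

```python
import math


def min_max_scale(values):
    """Map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; return zeros by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


data = [2.0, 4.0, 6.0, 8.0]  # made-up sample column
scaled = min_max_scale(data)
zscores = standardize(data)
```

Min-max scaling keeps the shape of the distribution but is sensitive to outliers, while standardization is usually the safer default for models that assume centred features.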

Natural Language Processing

  1. Bag of Words
  2. TF-IDF
  3. Hashing algorithm
  4. Models gathering (Bayes, SVM, XGB, LightGBM)
  5. sklearn pipeline
  6. N-gram
  7. Topic Modelling
  8. Naive-Bayes-SVM on hate speech
  9. Black Panther visualization using word clouds, semantic and k-means similarity network
  10. Semantic similarity on Malaysia hot topics
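
For readers new to the first two items in this list, the following is a minimal sketch of bag-of-words counting and TF-IDF weighting using only the standard library. It uses the plain textbook formula `tf * log(N / df)`, which differs from the smoothed variant that scikit-learn applies by default; the helper names and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return the sorted vocabulary and a document-term count matrix."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    counts = [Counter(tokenize(doc)) for doc in docs]
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def tf_idf(docs):
    """Weight each count by term frequency times inverse document frequency."""
    vocab, matrix = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for row in matrix:
        total = sum(row)
        weighted.append(
            [(row[j] / total) * idf[j] if total else 0.0 for j in range(len(vocab))]
        )
    return vocab, weighted


docs = ["the cat sat", "the dog sat", "the cat ran"]  # toy corpus
vocab, counts = bag_of_words(docs)
_, weights = tf_idf(docs)
```

In this toy corpus "the" appears in every document, so its weight collapses to zero, which is the behaviour TF-IDF is designed to produce for uninformative words.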

Suggestion Engine using Nearest-Euclidean and Gaussian Distribution

  1. Anime
  2. Game
  3. Movie
  4. Kickstarter projects
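
The nearest-Euclidean part of the approach named in this section's heading can be sketched as follows. This is only an illustration under stated assumptions: the catalog, its feature vectors and the `suggest` function are hypothetical, and the Gaussian-distribution part is not covered.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def suggest(catalog, liked, k=2):
    """Return the k items whose features lie closest to the liked item."""
    target = catalog[liked]
    ranked = sorted(
        (name for name in catalog if name != liked),
        key=lambda name: euclidean(catalog[name], target),
    )
    return ranked[:k]


# Hypothetical items described by three made-up feature scores each.
catalog = {
    "title_a": [0.9, 0.1, 0.0],
    "title_b": [0.8, 0.2, 0.1],
    "title_c": [0.1, 0.9, 0.3],
    "title_d": [0.0, 0.8, 0.9],
}
suggestions = suggest(catalog, "title_a", k=2)
```

With real data the features would typically be rescaled first, since Euclidean distance is dominated by whichever column has the largest range.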

Image processing

  1. Augmentation (flip, rotate, shifting, zoom, shear, channel shift, grayscale, contrast, saturation)
  2. RGB subdivide
  3. HOG features
  4. Image segmentation (nucleus)
  5. K Nearest Neighbors on PCA / NMF
  6. SVD study on nearest neighbors
  7. Image warping to full A4

Signal processing

  1. Blurring on 1D Signal (loop, and FFT)
  2. Blurring on 2D Signal (loop)
  3. Convolution of two signals
  4. Pass filters for frequencies
  5. Signal smoothing
  6. Signal cross-correlation
  7. Augmentation (pitching, speed, distribution noise, shifting, silent shifting)
  8. Featuring (mfcc, log-energy, feature cube, power spectrum)
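
The loop-based 1D blurring from the first item can be sketched as a sliding weighted sum with zero padding at the edges. The function name `blur_1d`, the box kernel and the sample signal are assumptions for illustration; the kernel is applied without flipping, which is identical to convolution for symmetric kernels such as this one.

```python
def blur_1d(signal, kernel):
    """Blur a 1D signal with an odd-length kernel, keeping the output length.

    Samples outside the signal are treated as zero.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


box = [1.0 / 3.0] * 3                 # 3-tap moving-average kernel
spike = [0.0, 0.0, 3.0, 0.0, 0.0]     # made-up impulse signal
blurred = blur_1d(spike, box)
```

The loop costs O(n * k) for a signal of length n and kernel of length k; an FFT-based version trades that for O(n log n), which only pays off for long kernels.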

Stacking

  1. binary
  2. regression
  3. multi-class
  4. stack multiple sklearn regressors with XGB

Stochastic study

  1. Cryptocurrencies correlation
  2. Predicting cryptocurrencies with multiple stacked models
  3. Simple stock analysis
  4. ARIMA for flight prediction
  5. TESLA market study

Big-query

  1. Integrate BigQuery with pandas in Python
  2. Medicare queries with plotly visualization

Network study

  1. Graph nodes showing whom each person spoke to most
  2. Spooky social network analysis
  3. Taxi nodes analysis
  4. Stackoverflow tags analysis
  5. Donald Trump news social network
  6. Najib Razak Twitter social network

Visualization

  1. Geographic using basemap
  2. Folium map and time analysis
  3. Israel graph visualization
  4. Israel political landscape
  5. Distribution age vs type for library
  6. Growth study for library
  7. Botnet attack analysis
  8. Plotly geo-mapping 101
  9. Plotly bombing mapping visualization
  10. Easy plotly using cufflinks
  11. Plotly pokemon data
  12. Rare visualization
  13. Dynamic map visualization using plotly and folium
  14. Kaggle 2018 Report

Markov

  1. Independent variables on weather forecast
  2. Dependent variables on text dataset
  3. Shakespeare character-wise generator
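
A character-wise generator like the one in the last item can be sketched with a small Markov chain over fixed-length character states. The corpus, the order of 2 and the function names below are assumptions for illustration, not the notebook's actual setup.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` characters to the characters that follow it."""
    chain = defaultdict(list)
    for i in range(len(text) - order):
        state = text[i:i + order]
        chain[state].append(text[i + order])
    return chain


def generate(chain, seed, length, rng):
    """Extend `seed` by sampling up to `length` characters from the chain."""
    order = len(seed)
    out = seed
    for _ in range(length):
        followers = chain.get(out[-order:])
        if not followers:
            break  # dead end: this state never had a successor in the corpus
        out += rng.choice(followers)
    return out


corpus = "to be or not to be that is the question"
chain = build_chain(corpus, order=2)
sample = generate(chain, "to", 20, random.Random(0))
```

Because followers are stored with repetition, `rng.choice` samples each next character in proportion to how often it followed the current state in the corpus.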

English-text normalization

  1. Normalized texts (Dates, Measure, Decimals, Cardinals, Electronic - URL, Currency - Dollars, Telephone Numbers)
  2. Normalized texts (Cardinal, Digit, Ordinal, Letters, Address, Telephone, Electronic, Fractions, Money)
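
To make the idea of text normalization concrete, here is a minimal sketch for one of the listed classes, cardinals. It spells out standalone integers from 0 to 999 and leaves everything else untouched; the function names and the range limit are assumptions for illustration, and decimals, dates and the other classes would need their own rules.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def cardinal_to_words(n):
    """Spell out an integer in the range 0..999."""
    if not 0 <= n < 1000:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    hundreds, rest = divmod(n, 100)
    head = ONES[hundreds] + " hundred"
    return head if rest == 0 else head + " " + cardinal_to_words(rest)


def normalize_cardinals(text):
    """Replace standalone 1-3 digit numbers with their spoken form."""
    return re.sub(
        r"\b\d{1,3}\b",
        lambda match: cardinal_to_words(int(match.group())),
        text,
    )


normalized = normalize_cardinals("I have 42 apples and 7 pears")
```

Longer digit runs such as years are deliberately skipped here, because whether "2018" is a cardinal or a date depends on context that a single regular expression cannot see.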