R, Python and Mathematica Codes in Machine Learning, Deep Learning, Artificial Intelligence and NLP
pictures Stock Market Analysis - Dow Jones in R Feb 4, 2017
.gitignore Add files via upload May 8, 2017
1 - DATA for models DATA for models Jan 29, 2017
Data.zip Add files via upload Dec 16, 2016
MNIST_HOT.5.FULL.OK.pdf Add files via upload Jul 11, 2016
Mathematica - Artificial Intelligence Simulating Interactions in Social Networks Artificial Intelligence in Social Networks Jan 12, 2017
Mathematica - Facial Recognition in Movement Face Recognition in Movement Jan 12, 2017
Mathematica - Monte Carlo Simulation Markov Chain Monte Carlo - Self-Driving Car Jan 12, 2017
Mathematica - Social Network Surveillance Social Network Surveillance Jan 12, 2017
Python - Autoencoder MNIST.py Autoencoder for MNIST dataset using Keras Jan 28, 2017
Python - Autoencoder for Text Classification Autoencoder for Text Classification and PCA Jan 12, 2017
Python - Deep Learning with Lasagne Deep Learning in Lasagne Library Jan 12, 2017
Python - Face Recognition.py Face Recognition with OpenCV Jan 28, 2017
Python - Image Extraction from Twitter Image Extraction from Webpages (Twitter) Jan 12, 2017
Python - Keras Convolutional Neural Network in Keras Convolutional Neural Networks in Keras Jan 12, 2017
Python - Keras Deep Regressor Deep Learning using Keras Feb 3, 2017
Python - Keras LSTM Network LSTM Neural Network for Text Predicition Jan 12, 2017
Python - Keras LSTM Time Series Keras Neural Network LSTM - Time Series Prediction Jan 25, 2017
Python - Keras Multi Layer Perceptron Keras Multi Layer Perceptron Jan 29, 2017
Python - Machine Learning Principal Component analysis + Linear Regression Jan 12, 2017
Python - Machine Learning Algos 7 Machine Learning Algorithms and Hyperparameters Jan 19, 2017
Python - Maps Maps in Python using folium library Jun 15, 2017
Python - NLP Doc2Vec Natural Language Processing: Doc2Vec Jan 12, 2017
Python - NLP Semantic Analysis Natural Language Processing: Semantic Analysis Jan 12, 2017
Python - NLP Word2Vec Natural Language Processing: Word2Vec from scratch Jan 12, 2017
Python - Reinforcement Learning Reinforcement Learning based on Game Theory Payoff Jan 12, 2017
Python - Social Networks Social Network Visualization Jan 12, 2017
Python - Support Vector Machines Support Vector Machines Jan 12, 2017
Python - Theano Classification Theano Neural Networks Classification May 13, 2017
Python - Theano Deep Learning Deep Learning in Theano Jan 12, 2017
R - Churn of Customers Analyzing Customer Churn in R Jan 12, 2017
R - Customer Price Index Customer Price Index vs Purchase Power FRED Data Feb 3, 2017
R - Data Cleaning + Multinomial Regression Data Cleaning + Multinomial Regression Neural Net Jan 12, 2017
R - Face Recognition Face Recognition in R Jan 12, 2017
R - Geolocation Brazil Geolocation in R (Brazil map) Jan 12, 2017
R - Geolocation USA Geolocation in R (USA map) Jan 12, 2017
R - Geolocation World Geolocation World - Icon Customization Jan 25, 2017
R - Gradient Descent Logistic Gradient Descent for Logistic Regression Jan 24, 2017
R - H2O Deep Learning Deep Learning in R using H2O Jan 12, 2017
R - Imbalanced Classes Imbalanced Classes with Hill Climbing Algorithm Feb 15, 2017
R - Logistic + Boosting Logistic Regression + Gradient Descent + Boosting Jan 24, 2017
R - MNIST MNIST solution from scratch 96% accuracy in R Jan 12, 2017
R - Markov Chains Markov Chains in R Jan 12, 2017
R - NeuralNet Neural Networks for Classification in R Jan 12, 2017
R - Principal Components Analysis Principal Components Analysis Feb 5, 2017
R - Ridge Regularization R - Ridge Regularization Jan 20, 2017
R - Stock Market Candlestick Stock Market Analysis via Candlestick Feb 1, 2017
R - t-SNE t-SNE Dimensionality Reduction - Digits Dataset Feb 5, 2017
R- Deep Learning.R Deep Learning in R Using neuralnet Library Jan 28, 2017
Readme.md Guidelines for Models May 1, 2017
_config.yml Set theme jekyll-theme-hacker Jan 22, 2017


R, Python and Mathematica Codes in Data Science

Welcome to my GitHub repo.

I am a Data Scientist and I code in R, Python and Wolfram Mathematica. Here you will find some Machine Learning, Deep Learning, Natural Language Processing and Artificial Intelligence models I developed.

Outputs of the models can be seen at my portfolio: https://drive.google.com/file/d/0B0RLknmL54khdjRQWVBKeTVxSHM/view?usp=sharing

Mathematica Codes

MNIST_HOT.5.FULL: is a solution for the MNIST dataset in Mathematica, with 96.51% accuracy, based on pixel differences.

Mathematica - Artificial Intelligence Simulating Interactions in Social Networks: is a model that simulates human interactions in a social network using cellular automata and agent-based modeling. Each agent has a memory and 3 possible choices for interaction. The code is 14 pages long and includes a large loop written in a single line of code.

Mathematica - Facial Recognition in Movement: is a model that performs facial recognition on a downloaded YouTube video. The output is also a video showing the result of the face recognition (a YouTube link to the output is included in the code page).

Mathematica - Monte Carlo Simulation: is an animated model of a Markov Chain Monte Carlo simulation for autonomous driving. A video of the dynamic output was also generated, and a link to the YouTube video is included in the code page.

Mathematica - Social Network Surveillance: is a model that tracks individuals in a social network, along with their connections and future interactions.

Python Codes

Keras version used in models: keras==1.1.0 | LSTM 0.2

Python - Autoencoder MNIST: is an autoencoder model for classification of images developed with Keras, for the MNIST dataset, with model Checkpoint as a callback to save weights.

Python - Autoencoder for Text Classification: is an autoencoder model for classification of text made with Keras, also with model Checkpoint.

Python - Deep Learning with Lasagne: is a deep neural network developed with Lasagne, where you can see values of weights in each layer, including bias.

Python - Face Recognition: is a model using OpenCV to detect faces.

Python - Image Extraction from Twitter: is a model that extracts pictures and their links from Twitter webpages, plotting with matplotlib.

Python - Keras Convolutional Neural Network: is a CNN developed to classify the MNIST dataset with an accuracy greater than 99%.

Python - Keras Deep Regressor: is a deep neural network built with Keras to predict a continuous output. It uses random initial weights, a learning rate scheduler driven by the derivative of the error, and a loss history.
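The scheduling rule itself is not spelled out in this README, so the following is only a minimal sketch of one way a learning rate could react to the derivative of the error. It assumes the "derivative" is the difference between the last two recorded losses and that the rate is halved whenever the loss stops decreasing. The function name `schedule_learning_rate` and its parameters are hypothetical, and the example uses plain Python on a toy quadratic rather than Keras.

```python
def schedule_learning_rate(loss_history, learning_rate, decay=0.5, min_rate=1e-6):
    """Shrink the learning rate when the loss has stopped decreasing.

    Assumption: the 'derivative of the error' is approximated by the
    difference between the two most recent loss values.
    """
    if len(loss_history) < 2:
        return learning_rate
    derivative = loss_history[-1] - loss_history[-2]
    if derivative >= 0:
        return max(learning_rate * decay, min_rate)
    return learning_rate


# Toy problem: minimise f(w) = w**2 starting with a rate that is too large.
w, lr, losses = 1.0, 1.2, []
for _ in range(20):
    losses.append(w * w)                      # loss history
    lr = schedule_learning_rate(losses, lr)   # adapt the rate from the history
    w -= lr * 2 * w                           # gradient step, f'(w) = 2w

print(lr, losses[-1])
```

With the initial rate of 1.2 the first step overshoots and the loss rises, so the scheduler halves the rate once and the remaining steps converge.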

Python - Keras LSTM Network: is a Recurrent Neural Network (LSTM) to predict and generate text.

Python - Keras Multi Layer Perceptron: is an MLP neural network built with Keras for prediction and classification, with a loss history and a learning rate scheduled according to the derivative of the error.

Python - Machine Learning: is a Principal Components Analysis followed by a Linear Regression.
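As a rough illustration of the idea, the sketch below projects the data onto its first principal component and then fits a one-variable least squares line on the projected scores. It is written in plain Python with power iteration so that it has no dependencies; the repository code may well use a library instead. The helper names and the toy data are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def first_principal_component(centered_rows, iterations=200):
    """Leading eigenvector of the covariance matrix, found by power iteration."""
    n, d = len(centered_rows), len(centered_rows[0])
    cov = [[sum(r[i] * r[j] for r in centered_rows) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def pca_then_regress(X, y):
    """Return a predict(row) function: PCA to one component, then least squares."""
    d = len(X[0])
    col_means = [mean([row[j] for row in X]) for j in range(d)]
    centered = [[row[j] - col_means[j] for j in range(d)] for row in X]
    pc = first_principal_component(centered)
    scores = [sum(r[j] * pc[j] for j in range(d)) for r in centered]
    s_mean, y_mean = mean(scores), mean(y)
    slope = (sum((s - s_mean) * (t - y_mean) for s, t in zip(scores, y))
             / sum((s - s_mean) ** 2 for s in scores))
    intercept = y_mean - slope * s_mean

    def predict(row):
        score = sum((row[j] - col_means[j]) * pc[j] for j in range(d))
        return intercept + slope * score

    return predict


# Toy data: two perfectly correlated features, target = x1 + x2.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [3.0, 6.0, 9.0, 12.0]
predict = pca_then_regress(X, y)
print(round(predict([5.0, 10.0]), 6))  # → 15.0
```

Keeping a single component is an assumption made to keep the sketch short; with more components the regression step becomes a multivariate least squares fit.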

Python - NLP Doc2Vec: is a Natural Language Processing model where I asked a Wikipedia webpage a question and 4 possible answers were semantically chosen from the tokenized and vectorized webpage, using KNN and cosine distance.

Python - NLP Semantic Analysis: is a Natural Language Processing model that classifies a given sentence according to semantic similarity to other sentences, using cosine distance.
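A minimal sketch of classification by cosine similarity is shown below. It assumes a simple bag-of-words representation and assigns the label of the single most similar reference sentence. The repository model may vectorize text differently, and the reference sentences and labels here are made up for illustration.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Lowercase the sentence, split on whitespace, and count the tokens."""
    return Counter(sentence.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(sentence, labeled_sentences):
    """Return the label of the reference sentence most similar to `sentence`."""
    query = bag_of_words(sentence)
    best_label, _ = max(
        ((label, cosine_similarity(query, bag_of_words(text)))
         for text, label in labeled_sentences),
        key=lambda pair: pair[1],
    )
    return best_label


# Hypothetical reference sentences with labels.
REFERENCES = [
    ("the stock market fell sharply today", "finance"),
    ("the team won the football match", "sports"),
    ("the central bank raised interest rates", "finance"),
]

print(classify("the football team lost the match", REFERENCES))  # → sports
```

Cosine distance is one minus this similarity, so picking the highest similarity is the same as picking the smallest distance.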

Python - NLP Word2Vec: is a model developed from scratch to measure cosine similarity among words.

Python - Reinforcement Learning: is a model based on simple rules and Game Theory in which the agents' attitudes change according to the payoff achieved. It can be adapted for tit-for-tat, always cooperate, always defect and other strategies. Rewards are placed in the payoff matrix.
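To make the payoff matrix and the named strategies concrete, here is a small sketch of a repeated two-player game. The payoff values are the textbook prisoner's dilemma numbers, chosen only for illustration, and the strategies are fixed rules; the adaptive behavior of the repository model is not reproduced here.

```python
# (my_move, their_move) -> my reward. "C" = cooperate, "D" = defect.
# Illustrative values, not taken from the repository.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the opponent's previous move."""
    return other_history[-1] if other_history else "C"


def always_cooperate(own_history, other_history):
    return "C"


def always_defect(own_history, other_history):
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the total reward of each player."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, always_defect))   # → (9, 14)
print(play(tit_for_tat, tit_for_tat))     # → (30, 30)
```

An agent that changes its attitude according to payoff could be added as another strategy function that inspects the rewards accumulated so far.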

Python - Social Networks: is a model that draws social network configurations and connections.

Python - Support Vector Machines: is a Machine Learning model that classifies the Iris dataset with SVM and plots it.

Python - Theano Deep Learning: is a Neural Network with two hidden layers using Theano.

R Codes

R - Churn of Customers: is a model that uses a logistic regression combined with a threshold to predict which customers are at the greatest risk of being lost.

R - Data Cleaning + Multinomial Regression: is a model that presents data cleaning and a multinomial regression using package nnet to classify customers according to their level of loyalty.

R - Face Recognition: is a code to detect faces and objects in R.

R - Geolocation Brazil: is a file for geo-spatial localization, Brazilian map.

R - Geolocation USA: is also a file for geo-spatial localization, USA map.

R - Geolocation World: is a file for geo-spatial localization, world map, zoom available, customizable icons.

R - Gradient Descent Logistic: is a model that performs a gradient descent to define a threshold for the sigmoid function in a Logistic Regression. Boosting was implemented and the ROC curves were compared.
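The repository code for this model is in R. The sketch below is a Python illustration of the gradient descent part only, fitting logistic regression coefficients by batch gradient descent on the mean log-loss. The function names, learning rate, epoch count and toy data are assumptions, and boosting and ROC curves are not covered.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss. Returns (weights, bias)."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            # Gradient of the log-loss w.r.t. the linear output is (p - y).
            error = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) - target
            for j in range(d):
                grad_w[j] += error * row[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict_proba(row, weights, bias):
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


# Toy data: one feature, class 1 for the larger values.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
weights, bias = fit_logistic(X, y)
print(predict_proba([0.0], weights, bias) < 0.5, predict_proba([5.0], weights, bias) > 0.5)
```

The predicted probability is then turned into a class by comparing it with a threshold, which is where the threshold search described above comes in.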

R - H2O Deep Learning: is a Neural Network model developed to predict recommendations and word-of-mouth advertising.

R - Imbalanced Classes: is a model for employee churn in which the features have no correlation with the target variable and the classes are imbalanced in a proportion of 1/20. A logistic regression written from scratch is applied, a hill climbing algorithm is used to find the best threshold for the logistic function, and the result is then compared with boosting by AUC in a ROC plot.
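A hill climbing search for a decision threshold can be sketched as follows. This is a Python illustration, not the R code from the repository, and it assumes the quantity being maximized is the F1 score; the README does not say which score the original uses. The probabilities and labels are made-up toy values with a 1-in-10 positive rate.

```python
def f1_score(labels, probabilities, threshold):
    """F1 of the positive class when predicting 1 for p >= threshold."""
    tp = fp = fn = 0
    for label, p in zip(labels, probabilities):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def hill_climb_threshold(labels, probabilities, start=0.5, step=0.25, min_step=0.001):
    """Move to a neighbouring threshold while it improves the score,
    halving the step size whenever neither neighbour is better."""
    best = start
    best_score = f1_score(labels, probabilities, best)
    while step >= min_step:
        improved = False
        for candidate in (best - step, best + step):
            if 0.0 <= candidate <= 1.0:
                score = f1_score(labels, probabilities, candidate)
                if score > best_score:
                    best, best_score, improved = candidate, score, True
        if not improved:
            step /= 2
    return best, best_score


# Hypothetical model outputs: 18 negatives and 2 positives.
PROBABILITIES = [0.02, 0.03, 0.05, 0.05, 0.06, 0.08, 0.08, 0.10, 0.10, 0.12,
                 0.12, 0.15, 0.15, 0.18, 0.20, 0.22, 0.25, 0.28, 0.35, 0.40]
LABELS = [0] * 18 + [1, 1]

threshold, score = hill_climb_threshold(LABELS, PROBABILITIES)
print(threshold, score)  # → 0.3125 1.0
```

With imbalanced classes the default threshold of 0.5 predicts no positives at all here, which is why the threshold has to be searched. Hill climbing can stall on flat regions of the score, so the starting step is deliberately large.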

R - Logistic + Boosting: is a model where the features have no correlation with the target variable. Logistic Regression with Gradient Descent was applied, and then Boosting.

R - MNIST: is a solution for the MNIST dataset, developed from scratch.

R - Markov Chains: is a simple visualization of Markov Chains and their associated probabilities.
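For readers new to the topic, the sketch below shows how the probabilities of a Markov chain evolve by repeatedly multiplying a state distribution by a transition matrix. It is a Python illustration rather than the R code in the repository, and the two-state weather chain is a hypothetical example.

```python
def step_distribution(distribution, transition):
    """One step of the chain: new[j] = sum_i distribution[i] * transition[i][j]."""
    n = len(transition)
    return [sum(distribution[i] * transition[i][j] for i in range(n))
            for j in range(n)]


def distribution_after(distribution, transition, steps):
    """State distribution after the given number of steps."""
    for _ in range(steps):
        distribution = step_distribution(distribution, transition)
    return distribution


# Hypothetical chain. Each row holds the probabilities of leaving one state
# and must sum to 1.
TRANSITION = [
    [0.9, 0.1],  # sunny -> sunny, sunny -> rainy
    [0.5, 0.5],  # rainy -> sunny, rainy -> rainy
]

start = [1.0, 0.0]  # start in the sunny state
print(distribution_after(start, TRANSITION, 1))
print(distribution_after(start, TRANSITION, 50))
```

After many steps the distribution approaches the stationary distribution of this chain, (5/6, 1/6), regardless of the starting state.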

R - NeuralNet: is a Neural Network model developed to predict and classify word-of-mouth advertising.

R - Ridge Regression: is a model with Ridge Regularization made from scratch to prevent overfitting.
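The effect of the ridge penalty can be illustrated with a short sketch. This is Python rather than the R code in the repository, and it assumes the objective is the mean squared error plus `penalty` times the squared norm of the weights, minimized by gradient descent with an unpenalized bias. The function name, hyperparameters and toy data are hypothetical.

```python
def fit_ridge(X, y, penalty=1.0, learning_rate=0.01, epochs=5000):
    """Minimise mean((Xw + b - y)^2) + penalty * ||w||^2 by gradient descent.

    The bias b is left unpenalised. Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            residual = sum(w * x for w, x in zip(weights, row)) + bias - target
            for j in range(d):
                grad_w[j] += 2 * residual * row[j] / n
            grad_b += 2 * residual / n
        # The penalty adds 2 * penalty * w to the gradient of each weight.
        weights = [w - learning_rate * (g + 2 * penalty * w)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Toy data generated from y = 2x + 1.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]

ols_weights, ols_bias = fit_ridge(X, y, penalty=0.0)
ridge_weights, ridge_bias = fit_ridge(X, y, penalty=1.0)
print(round(ols_weights[0], 4), round(ridge_weights[0], 4))
```

With no penalty the fit recovers the slope of 2. With a penalty of 1 the slope shrinks toward zero, which is the mechanism that limits overfitting.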

R - Deep Learning: is a Neural Network model with 2 hidden layers for prediction of a continuous variable.