Caution: This timeline is tailored for @EshbanTheLearner and might not be suitable for everyone.
Today's Progress: Today I continued the Stanford CS224N: NLP with Deep Learning course by Christopher Manning on YouTube.
Description: This includes the following:
- Lecture 7 - Vanishing Gradients and Fancy RNNs
- Vanishing Gradient
- Intuition
Today's Progress: Today I continued the Stanford CS224N: NLP with Deep Learning course by Christopher Manning on YouTube.
Description: This includes the following:
- Lecture 6 - Language Models and RNNs
- n-gram Language Model
- Sparsity Problem
- Neural Language Model
- Window-based Neural Model
- Recurrent Neural Networks
- Advantages and Disadvantages
- Training RNN Language Model
- Backpropagation for RNNs
- Evaluating Language Models
- Perplexity
- Applications
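The perplexity topic from this lecture can be sketched in a few lines: perplexity is the exponential of the average per-token negative log-likelihood. A minimal illustration (the loss values below are made up):

```python
import math

def perplexity(neg_log_likelihoods):
    """Perplexity = exp(mean per-token negative log-likelihood),
    using natural logs throughout."""
    return math.exp(sum(neg_log_likelihoods) / len(neg_log_likelihoods))

# A model that assigns probability 1/4 to every token has perplexity 4:
nlls = [math.log(4)] * 10
print(perplexity(nlls))  # ≈ 4.0
```

Lower perplexity means the model is, on average, less "surprised" by the test tokens.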
Today's Progress: Today I continued the Stanford CS224N: NLP with Deep Learning course by Christopher Manning on YouTube.
Description: This includes the following:
- Lecture 5 - Dependency Parsing
- Syntactic Structure: Constituency and Dependency
- Context-Free Grammars (CFGs)
- Prepositional Phrase Attachment Ambiguity
- Coordination Scope Ambiguity
- Adjectival Modifier Ambiguity
- Verb Phrase Attachment Ambiguity
- Dependency Grammar
- Universal Dependencies Treebanks
- Dependency Conditioning Preferences
- Bilexical Affinities
- Dependency Distance
- Intervening Material
- Valency of Heads
- Transition Based Dependency Parsing
- MaltParser - Nivre and Hall 2005
- Neural Dependency Parsing
- Model Architecture
Today's Progress: Today I continued the Stanford CS224N: NLP with Deep Learning course by Christopher Manning on YouTube.
Description: This includes the following:
- Lecture 4 - Backpropagation
- Matrix Gradients for Simple Neural Network
- Computation Graphs
- Backpropagation
- Upstream Gradient
- Local Gradient
- Downstream Gradient
- Automatic Differentiation
- Regularization
- Overfitting
- Vectorization
- Nonlinearities
- Initialization
- Optimizers
- Learning Rates
Today's Progress: Today I continued the Stanford CS224N: NLP with Deep Learning course by Christopher Manning on YouTube.
Description: This includes the following:
- Lecture 3 - Neural Networks
- Classification Review/Introduction
- Softmax Classifier
- Cross-Entropy Loss
- Neural Network Introduction
- Intuition
- Classification Difference with Word Vectors
- Matrix Notations for a Layer
- Non-Linearities
- Named Entity Recognition
- NER on Word Sequences
- Binary Word Window Classification
- Computing Gradients
- Jacobian Matrix
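The softmax classifier and cross-entropy loss from this lecture fit in a short from-scratch sketch (the logits below are arbitrary example values):

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_idx):
    # Loss for one example: negative log-probability of the true class.
    return -math.log(softmax(logits)[true_idx])

probs = softmax([2.0, 1.0, 0.1])
print(probs)                              # probabilities summing to 1
print(cross_entropy([2.0, 1.0, 0.1], 0))  # small loss: class 0 has the top logit
```

The loss is smallest when the true class already has the largest logit, which is what gradient descent pushes toward.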
Today's Progress: Today I continued the Stanford CS224N: NLP with Deep Learning course by Christopher Manning on YouTube.
Description: This includes the following:
- Lecture 2 - Word Vectors and Word Senses
- Optimization Basics
- Gradient Descent
- Stochastic Gradient Descent
- Word2Vec Review
- Skip-Grams
- Continuous Bag of Words
- Negative Sampling
- Unigram Distribution
- Can we capture this essence more effectively by counting?
- Co-Occurrence Matrix
- Window
- Full Document
- Problems and Solutions
- Singular Value Decomposition (SVD)
- The GloVe Model of Word Vectors
- Count Based vs Direct Prediction
- Encoding Meaning in Vector Differences
- Log-Bilinear Model with Vector Differences
- GloVe
- Evaluating Word Vectors
- Intrinsic
- Word Vector Analogies
- Human Judgement
- Correlation Judgement
- Extrinsic
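The word-vector-analogy evaluation covered here (king - man + woman ≈ queen) can be sketched with cosine similarity over toy vectors. The 2-D vectors below are hand-made, purely to show the mechanics:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def analogy(vecs, a, b, c):
    """Return the word whose vector is closest (by cosine) to
    vec(b) - vec(a) + vec(c), excluding the three query words."""
    target = [y - x + z for x, y, z in zip(vecs[a], vecs[b], vecs[c])]
    candidates = (w for w in vecs if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vecs[w], target))

# Tiny hypothetical embedding table:
vecs = {
    "king":  [0.9, 0.8],
    "queen": [0.9, 0.2],
    "man":   [0.1, 0.8],
    "woman": [0.1, 0.2],
}
print(analogy(vecs, "man", "woman", "king"))  # → "queen"
```

Real intrinsic evaluation runs this over thousands of analogy questions against trained embeddings.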
Today's Progress: Today I started the Stanford CS224N: NLP with Deep Learning course by Christopher Manning on YouTube.
Description: This includes the following:
- Lecture 1 - Introduction and Word Vectors
- Human Language
- Word Meaning
- Denotational Semantics
- WordNet - Advantages and Disadvantages
- Distributional Semantics
- Word2Vec
- Word2Vec Introduction
- Skip-Grams
- Continuous Bag of Words
- Word2Vec Objective Function Gradients
- Optimization Basics
- Gradient Descent
- Stochastic Gradient Descent
- Looking at Word Vectors
YouTube | Stanford CS224N: NLP with Deep Learning
Today's Progress: Today I concluded the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Calculating Expected Loss
- Data Prep
- Train-Test Split
- Dummy Variables
- Estimate Recovery Rate
- Total Expected Loss on Portfolio Level
- Summary
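The expected-loss decomposition that this course builds toward can be shown in a few lines. The loan parameters below are hypothetical:

```python
def expected_loss(pd, lgd, ead):
    """EL = PD * LGD * EAD, the decomposition used throughout the course."""
    return pd * lgd * ead

# Hypothetical three-loan portfolio: (PD, LGD, EAD) per loan.
portfolio = [(0.03, 0.60, 10_000), (0.10, 0.45, 5_000), (0.01, 0.80, 25_000)]
total_el = sum(expected_loss(*loan) for loan in portfolio)
print(total_el)  # ≈ 180 + 225 + 200 = 605
```

Portfolio-level EL is just the sum of loan-level ELs, which is why the PD, LGD, and EAD models are built separately and then combined.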
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Exposure at Default (EAD) Model
- Data Prep
- Train-Test Split
- Selecting Reference Categories
- Model Training
- Linear Regression
- EAD Model Validation
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Loss Given Default (LGD) Model
- Data Prep
- Train-Test Split
- Preparing Inputs
- Training the LGD Model
- Logistic Regression
- Testing the LGD Model
- Accuracy of LGD Model
- ROC-AUC of LGD Model
- Saving the LGD Model
- Stage 2 - Multiple Linear Regression
- Data Prep
- Training
- Testing
- Correlation
- Mean Squared Error
- Combining Stage 1 and Stage 2 Model
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Applying PD Model for Decision Making
- Calculating Probability of Default for a single Customer
- Creating a Scorecard
- Calculating Credit Score
- From Credit Score to PD
- Setting Cut-Offs
- Loss Given Default (LGD) and Exposure at Default (EAD) Models
- Independent Variables
- Dependent Variables
- Distribution of Recovery Rates and Credit Conversion Factors
- Beta Distribution
- Beta Regression
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- PD Model Validation
- Out of Sample Validation
- Model Evaluation
- Accuracy
- Recall and Precision
- ROC-AUC
- Gini Coefficient and Kolmogorov-Smirnov Statistic
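The ROC-AUC and Gini coefficient mentioned here are closely related: AUC is the probability that a random positive outranks a random negative (the Mann-Whitney view), and the Gini used in credit scoring is 2·AUC − 1. A from-scratch sketch with made-up scores:

```python
def auc(scores_pos, scores_neg):
    """ROC-AUC as a rank statistic: probability that a random positive
    is scored above a random negative (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def gini(scores_pos, scores_neg):
    # Gini coefficient as used in credit scoring.
    return 2 * auc(scores_pos, scores_neg) - 1

pos = [0.9, 0.8, 0.4]  # model scores of actual positives (hypothetical)
neg = [0.5, 0.3, 0.2]  # model scores of actual negatives (hypothetical)
print(auc(pos, neg), gini(pos, neg))
```

The O(n·m) double loop is fine for illustration; production code sorts once instead.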
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Probability of Default (PD) Model Estimation
- Logistic Regression with Dummy Variables
- Logistic vs Linear
- Odds
- Interpreting Coefficients of Logistic Regression
- Logistic Regression with P-Values
- Interpreting PD Model Coefficients
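Interpreting logistic-regression coefficients via odds is easy to verify numerically: a coefficient b multiplies the odds by exp(b) per unit increase. The coefficient value below is hypothetical:

```python
import math

def odds_ratio(coef):
    """A logistic-regression coefficient b means the odds are
    multiplied by exp(b) for a one-unit increase in that variable."""
    return math.exp(coef)

b = 0.7  # hypothetical coefficient on a dummy variable
print(odds_ratio(b))  # ≈ 2.01: that category roughly doubles the odds

# Sanity check against the definition odds = p / (1 - p):
def odds(p):
    return p / (1 - p)

p0 = 0.5                           # baseline probability
logit = math.log(odds(p0)) + b     # add the coefficient on the logit scale
p1 = 1 / (1 + math.exp(-logit))    # back to probability
print(odds(p1) / odds(p0))         # same factor, exp(0.7)
```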
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Probability of Default (PD) Model Data Prep
- Data Prep
- Preprocessing Continuous Variables - II
- Creating Dummy Variables
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Probability of Default (PD) Model Data Prep
- Data Prep
- Preprocessing Continuous Variables - I
- Creating Dummy Variables
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Probability of Default (PD) Model Data Prep
- Data Prep
- Weights of Evidence
- Information Value
- Automating Calculations
- Visualizing Results
- Visualizing and Interpreting Weight of Evidence
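The Weight of Evidence and Information Value calculations covered here are short formulas: WoE_i = ln(dist_good_i / dist_bad_i) per category, and IV sums (dist_good − dist_bad)·WoE over categories. A sketch with hypothetical counts:

```python
import math

def woe_iv(bins):
    """bins: list of (n_good, n_bad) counts per coarse-classed category.
    Returns (WoE per category, total Information Value)."""
    total_good = sum(g for g, b in bins)
    total_bad = sum(b for g, b in bins)
    woes, iv = [], 0.0
    for g, b in bins:
        dg, db = g / total_good, b / total_bad
        w = math.log(dg / db)          # WoE for this category
        woes.append(w)
        iv += (dg - db) * w            # each category's IV contribution
    return woes, iv

# Hypothetical non-defaulter (good) / defaulter (bad) counts in three bins:
woes, iv = woe_iv([(300, 10), (500, 40), (200, 50)])
print(woes, iv)  # riskier bins get negative WoE; IV summarizes predictive power
```

A common rule of thumb is that IV below ~0.02 means the variable is not predictive, though cut-offs vary by practitioner.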
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Probability of Default (PD) Model Data Prep
- Intro to PD Model
- Dependent Variables
- Default Definition
- Logistic Regression
- Interpretability
- Dependent Variable
- Information Value
- Weight of Evidence
- Data Prep
- Train-Test Split
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- General Preprocessing
- Basic EDA
- Dealing with Continuous Variables
- Dealing with Discrete Variables
- Dealing with Missing Values
Today's Progress: Today I continued with the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Setting up the Working Environment
- Data Description
- Lending Club Loan Data
- Dependent and Independent Variables
- Discrete and Continuous Variables
- Fine and Coarse Classing
Today's Progress: Today I enrolled in the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Introduction
- Capital Adequacy / Regulatory Capital
- Capital Adequacy Ratio
- Basel II Accord
- Minimum Capital Requirements
- Credit Risk
- Standardized Approach
- Internal Ratings Based Approaches
- Foundation Internal Ratings-Based Approach (F-IRB)
- Advanced Internal Ratings-Based Approach (A-IRB)
- Operational Risk
- Market Risk
- Supervisory Review
- Market Discipline
- Different Facility Types
- Credit Risk Modeling Approaches
Today's Progress: Today I enrolled in the Credit Risk Modeling in Python 2020 course on Udemy.
Description: This includes the following:
- Introduction
- What is Credit Risk?
- Creditor
- Debtor
- Credit Limit
- Interest
- Home Ownership and Asset Financing
- Credit Risk
- Default Event
- The Global Financial Crisis 2008
- Expected Loss and its Components
- Types of Factors for Expected Loss
- Borrower-Specific Factors
- The Economic Environment
- Expected Credit Loss
- Probability of Default
- Loss Given Default
- Exposure at Default
Udemy | Credit Risk Modeling in Python 2020
Today's Progress: Today I concluded the Natural Language Processing with Sequence Models course from the Natural Language Processing Specialization.
Description: This includes the following:
- Siamese Networks
- Computing the Cost - I
- Computing the Cost - II
- Mean Negative
- Closest Negative
- Hard Negative Mining
- One Shot Learning
- Training/Testing
- Programming Assignment
- Question Duplicates
Certificate | Natural Language Processing with Sequence Models
Today's Progress: Today I continued with the Natural Language Processing with Sequence Models course from the Natural Language Processing Specialization.
Description: This includes the following:
- Siamese Networks
- Introduction
- Architecture
- Identical Subnetworks
- Cosine Similarity
- Cost Function
- Anchor
- Positive
- Negative
- Triplets
- Simple Loss
- Non-Linearity
- Alpha Margin
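The triplet loss with the alpha margin covered here is a one-liner once cosine similarity is in place. A sketch with toy 2-D vectors:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def triplet_loss(anchor, positive, negative, alpha=0.25):
    """Simple triplet loss: max(0, cos(A, N) - cos(A, P) + alpha).
    The margin alpha forces the positive to be strictly closer
    to the anchor than the negative."""
    return max(0.0, cosine(anchor, negative) - cosine(anchor, positive) + alpha)

a = [1.0, 0.0]
p = [0.9, 0.1]   # similar to the anchor
n = [0.0, 1.0]   # dissimilar
print(triplet_loss(a, p, n))  # → 0.0: already separated by more than the margin
```

Mean-negative and hard-negative variants change which negatives feed this formula, not the formula itself.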
Today's Progress: Today I continued with the Natural Language Processing with Sequence Models course from the Natural Language Processing Specialization.
Description: This includes the following:
- LSTMs and Named Entity Recognition
- Introduction to Named Entity Recognition
- Applications of NER
- Search Engine Efficiency
- Recommendation Engines
- Customer Service
- Automatic Trading
- Training NERs: Data Processing
- Computing Accuracy
- Programming Assignment
- Named Entity Recognition (NER)
Today's Progress: Today I continued with the Natural Language Processing with Sequence Models course from the Natural Language Processing Specialization.
Description: This includes the following:
- LSTMs and Named Entity Recognition
- RNNs and Vanishing Gradients
- Advantages vs Disadvantages
- Exploding Gradients
- Identity RNN with ReLU Activation
- Gradient Clipping
- Skip Connections
- Introduction to LSTMs
- Basic LSTM Structure
- Applications of LSTMs
- Understanding LSTMs
- LSTM Architecture
- The Forget Gate
- The Input Gate
- The Output Gate
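The three LSTM gates covered here can be demonstrated with scalar state, just to make the data flow concrete. The gate weights below are hypothetical, not from any trained model:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def lstm_cell(x, h_prev, c_prev, W):
    """One LSTM step with scalar state. W maps each gate name to
    (input weight, hidden weight, bias) — hypothetical values."""
    def gate(name, act):
        wx, wh, b = W[name]
        return act(wx * x + wh * h_prev + b)
    f = gate("forget", sigmoid)          # what to erase from the cell state
    i = gate("input", sigmoid)           # what to write
    o = gate("output", sigmoid)          # what to expose as hidden state
    c_tilde = gate("candidate", math.tanh)
    c = f * c_prev + i * c_tilde         # new cell state
    h = o * math.tanh(c)                 # new hidden state
    return h, c

W = {"forget": (0.5, 0.5, 0.0), "input": (0.5, 0.5, 0.0),
     "output": (0.5, 0.5, 0.0), "candidate": (1.0, 1.0, 0.0)}
h, c = lstm_cell(1.0, 0.0, 0.0, W)
print(h, c)
```

The additive update of `c` (rather than repeated multiplication) is what lets gradients survive long sequences.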
Today's Progress: Today I continued with the Natural Language Processing with Sequence Models course from the Natural Language Processing Specialization.
Description: This includes the following:
- N-Grams vs Sequence Models
- Cost Function for RNNs
- Cross Entropy Loss
- Implementation Notes
- Abstraction in Frameworks
- The tf.scan() Function
- Gated Recurrent Units
- Reset Gate
- Update Gate
- Vanilla RNNs vs GRUs
- Deep and Bi-Directional RNNs
- Programming Assignments
- Deep N-grams
Today's Progress: Today I continued with the Natural Language Processing with Sequence Models course from the Natural Language Processing Specialization.
Description: This includes the following:
- N-Grams vs Sequence Models
- Traditional Language Models
- Large Corpus Requirements
- Large Space and RAM Requirements
- Recurrent Neural Networks
- Basic Structure
- Advantages
- Applications of RNNs
- One to One
- One to Many
- Many to One
- Many to Many
- Math in Simple RNNs
Today's Progress: Today I continued with the Natural Language Processing with Sequence Models course from the Natural Language Processing Specialization.
Description: This includes the following:
- Neural Networks for Sentiment Analysis
- Trax: Layers
- Classes
- Subclasses
- Instances
- Dense and ReLU Layers
- Serial Layer
- Other Layers
- Embedding Layer
- Mean Layer
- Training
- Gradients by grad()
- Programming Assignment
- Sentiment Analysis with Deep Neural Networks
Today's Progress: Today I enrolled in the Natural Language Processing with Sequence Models course from the Natural Language Processing Specialization.
Description: This includes the following:
- Neural Networks for Sentiment Analysis
- Introduction
- NN for Sentiment Analysis
- Neural Network Structure
- Forward Propagation
- Initial Representation
- Trax: Neural Networks
- Trax Highlights
- Advantages of Trax
- Why Trax?
- Makes programmers efficient
- Runs code fast
Coursera | Natural Language Processing with Sequence Models
Today's Progress: Today I completed the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Conversion of Different Problems to SDP - II
- Conic Quadratic Programming via SDP
- Schur Complement
- Barrier Method for Conic Programming
- Barrier Aggregate
- Examples of Barriers
- Matrix Functions
- Eigenvalue Decomposition
- Gradient of Trace of Matrix Function
- Gradient of log det A
- Gradient of log det Barrier
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Conversion of Different Problems to SDP - I
- Lemma of Schur Complement
- Minimize Maximal Eigenvalue of Symmetric Matrix
- Linear Matrix Approximation
- Expression of Linear Programming via SDP
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Conic Programming I
- Duality in Conic Programming
- Dual Conic Problem
- Weak Duality Theorem
- Strong Conic Duality and Complementary Slackness
- Example
- Dual SDP Problem
- Minimax Problem
- Chebyshev Approximation
- Complex Valued Chebyshev Approximation
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Conic Programming I
- Examples of Cones
- R^n+
- Lorentz Cone
- Cone of Positive Semidefinite Matrices
- Conic Programming Problems
- Semidefinite Programming
- Dual Cone
- Self-Dual
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Minimax Theorem, Game Theory and Lagrange Duality
- Game Interpretation of Minimax
- Saddle Point Theorem
- Minimax of Lagrangian
- Weak Duality
- Dual Problem and Weak Duality
- Strong Duality
- Slater Condition for Strong Duality
- Examples of Dual Problems
- Quadratic Program
- Linear Program
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Lagrange Multipliers via Penalty Method
- Active vs Non-Active Constraints
- Penalty Function for Equality Constraints
- First Order Necessary Optimal Conditions
- Barrier Method
- Augmented Lagrangian Method
- Penalty-Multiplier Function
- Algorithm
- Augmented Lagrangian for Equality Constraints
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Summary of Unconstrained Optimization
- 1-D Methods
- Golden Section - No Derivatives
- Bisection
- Quadratic/Cubic Interpolation Methods
- Inexact Line Search
- Backtracking Method
- Armijo Rule
- Multidimensional Optimizations
- Steepest Descent
- Newton
- Gauss-Newton
- Conjugate Gradient
- Truncated Newton
- Quasi-Newton
- BFGS
- Sequential Subspace Optimization
- Nelder-Mead Simplex Method
- Constrained Optimization
- Lagrangian
- Karush-Kuhn-Tucker First Order
- Necessary Optimality Conditions
- Penalty Function Method
- Penalty Aggregate
- Ideal Penalty Aggregate
- Algorithm
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Sequential Subspace Optimization
- Fast Subspace Optimization
- Quasi-Newton Method
- Approximate Newton Direction
- Approximating Hessian
- Secant Equation
- Sherman-Morrison Formula
- Broyden Family of Quasi-Newton
- BFGS - Broyden, Fletcher, Goldfarb, Shanno
- DFP - Davidon, Fletcher, Powell
- Initialization and Convergence Properties
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Conjugate Gradient Method - Part 2
- Derivation of Conjugate Gradient Method
- Exact Line Search
- Gram-Schmidt for Q-Orthogonality
- Properties
- Simplification
- Polak-Ribiere Method
- Fletcher-Reeves Method
- Conjugate Gradient Method Summary
- Convergence Rate of Conjugate Gradient Method
- Preconditioning
- Truncated Newton's Method
- Newton System
- Compute Resource Analysis for Products of:
- Function
- Gradient
- Hessian Vector
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Conjugate Gradient Method - Part 1
- Scalar Product
- Gram-Schmidt Orthogonalization
- Q-Conjugate/Q-Orthogonal Directions
- Minimization of a Quadratic Function
- Conjugate Direction Method
- Expanding Manifold Property
- Affine Subspace
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Newton's Method
- For Non-linear Equations
- Modified Newton's Method
- Enforcing Descent Direction
- Solving Symmetric System of Equations
- Cholesky Factorization
- Modified Cholesky Decomposition
- Least Squares Problem
- Gauss-Newton Method
- Levenberg-Marquardt Method
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Multidimensional Unconstrained Optimization Methods
- Line Search Methods
- Directional Derivative
- Choice of Step Size
- Exact Line Search
- Inexact Line Search - Armijo Rule
- Constant Step Size
- Diminishing Step Size
- Steepest/Gradient Descent
- Linear Convergence Rate
- Newton's Method
- Derivation
- Trust Region
- Asymptotic Quadratic Convergence
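The inexact line search with the Armijo rule covered here can be sketched together with steepest descent. The test function and all constants below are illustrative choices:

```python
def armijo_step(f, grad_f, x, d, alpha0=1.0, beta=0.5, sigma=1e-4):
    """Backtracking line search: shrink the step until the Armijo
    sufficient-decrease condition
    f(x + a*d) <= f(x) + sigma * a * (grad f(x) . d) holds."""
    gd = sum(g * di for g, di in zip(grad_f(x), d))  # directional derivative
    a = alpha0
    while f([xi + a * di for xi, di in zip(x, d)]) > f(x) + sigma * a * gd:
        a *= beta
    return a

# Steepest descent on f(x, y) = x^2 + 10 y^2 with Armijo steps:
f = lambda x: x[0] ** 2 + 10 * x[1] ** 2
grad = lambda x: [2 * x[0], 20 * x[1]]
x = [5.0, 1.0]
for _ in range(100):
    d = [-g for g in grad(x)]              # negative gradient = descent direction
    a = armijo_step(f, grad, x, d)
    x = [xi + a * di for xi, di in zip(x, d)]
print(x, f(x))  # converges toward the minimizer (0, 0)
```

The ill-conditioned quadratic (Hessian eigenvalues 2 and 20) also shows the linear convergence rate of steepest descent: progress slows along the flat axis.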
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Optimality Conditions
- Convex vs Non-Convex Functions
- Sufficient Optimality Conditions
- One-Dimensional Optimization
- Bisection Method
- Golden Section Method
- Quadratic Interpolation
- Superlinear Convergence
- Cubic Interpolation
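The golden section method from this lecture is short enough to write out in full; it minimizes a unimodal function without derivatives by keeping the golden ratio between the two interior evaluation points:

```python
import math

def golden_section(f, a, b, tol=1e-8):
    """Derivative-free minimization of a unimodal f on [a, b].
    Each iteration shrinks the bracket by the factor 1/phi ≈ 0.618."""
    invphi = (math.sqrt(5) - 1) / 2
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c                   # minimum lies in [a, d]
            c = b - invphi * (b - a)
        else:
            a, c = c, d                   # minimum lies in [c, b]
            d = a + invphi * (b - a)
    return (a + b) / 2

# Minimum of (x - 2)^2 on [0, 5]:
print(golden_section(lambda x: (x - 2) ** 2, 0.0, 5.0))  # → ≈ 2.0
```

This version re-evaluates f at both interior points each iteration for clarity; the classic implementation reuses one of the two evaluations.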
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Local and Global Minima
- Definition
- Norms
- Convex Functions and Minima
- Proof by Contradiction
- Optimality Conditions
- Proof by Gradient Inequality
- Non-Convex Functions and Minima
- Sufficient Optimality Condition
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Convex Sets & Functions
- Definition of Set and Function
- Properties of Convex Sets
- Properties of Convex Functions
- Extended Value Convex Functions
- Epigraph
- Properties of Epigraph
- Convex Combination
- Convex Hull
- Jensen Inequality
- Gradient Inequality
- Second Derivative
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Derivatives of Multivariate Functions
- Total Differential
- Gradient
- External Definition of Gradient
- Directional Derivative
- Hessian
- Second Directional Derivative
- Example - Gradient and Hessian of:
- Linear Operator
- Quadratic Functions
- Taylor Expansion
- Function of Matrices
- Gradient of Function of a Matrix
- Example
- Gradient of a Neural Network
Today's Progress: Today I continued with the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Linear Algebra Refresh
- N-Dimensional Euclidean Space
- Linear Subspace
- Affine Subspace
- Vector Norm
- Euclidean Norm
- Matrix Norm
- Frobenius Norm
- Induced Matrix Norm
- Inner Product
- Eigenvalue Decomposition
- Matrix Polynomials and Functions
- Positive (Semi) Definite Symmetric Matrices
Today's Progress: Today I started the Algorithms for Non-Linear Optimization course by Michael Zibulevsky.
Description: This includes the following:
- Unconstrained Optimization
- Continuous
- Smooth
- Constrained Optimization
- Inequality and Equality Constraints
- Optimality Conditions
- Numerical Iterative Methods
- Parametric Regression
- Linear Regression
- Nonlinear Regression
- Neural Networks
YouTube | Introduction to Optimization
Today's Progress: Today I concluded the Bayesian Machine Learning in Python: A/B Testing course on Udemy.
Description: This includes the following:
- Bayesian A/B Testing
- Thompson Sampling
- Online Nature of Bayesian A/B Testing
- Finding Threshold without p-Value
- Summary
- Classical A/B Testing and Drawbacks
- Bernoulli Distributed Data
- Methods that adapt to Data Collected so far
- How to Solve Explore-Exploit
- Exercises
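The Thompson sampling idea covered here needs only the standard library: keep a Beta posterior per arm, draw one sample from each, and pull the arm with the largest draw. The click rates below are made up for the simulation:

```python
import random

def thompson_step(successes, failures):
    """Pick the arm with the largest draw from its Beta posterior
    (Beta(1, 1) prior, so parameters are counts + 1)."""
    draws = [random.betavariate(s + 1, f + 1)
             for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)

# Simulate two variants with hypothetical true conversion rates:
random.seed(0)
true_rates = [0.10, 0.50]
succ, fail = [0, 0], [0, 0]
for _ in range(5000):
    arm = thompson_step(succ, fail)
    if random.random() < true_rates[arm]:
        succ[arm] += 1
    else:
        fail[arm] += 1
print(succ, fail)  # the better arm accumulates most of the traffic
```

This is the online nature of Bayesian A/B testing in action: exploration decays automatically as the posteriors sharpen, with no p-value threshold needed.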
Today's Progress: Today I continued with the Bayesian Machine Learning in Python: A/B Testing course on Udemy.
Description: This includes the following:
- Bayesian A/B Testing
- Exploit vs Explore Dilemma
- Multi-Armed Bandit
- Reinforcement Learning
- Epsilon-Greedy
- UCB1
- Chernoff-Hoeffding Bound
- Upper Confidence Bound
- Conjugate Priors
- Beta Mean, Variance
Today's Progress: Today I continued with the Bayesian Machine Learning in Python: A/B Testing course on Udemy.
Description: This includes the following:
- Traditional A/B Testing
- Problem Setup
- Hypotheses
- Null Hypothesis
- Alternative Hypothesis (One-Sided)
- Alternative Hypothesis (Two-Sided)
- Test Statistic
- p-Value
- Testing Characteristics
- Pooled Standard Deviation
- Welch's t-Statistic
- Non-Parametric Tests
- Kolmogorov-Smirnov Test
- Kruskal-Wallis Test
- Mann-Whitney U test
- Chi-Square Test Statistic
- Yates Correction
- Fisher's Exact Test
- Bonferroni Correction
- Pairwise Testing
- One-vs-Rest Test
- Post Hoc Testing
- Statistical Power
- Pitfalls of Traditional A/B Testing
Today's Progress: Today I continued with the Bayesian Machine Learning in Python: A/B Testing course on Udemy.
Description: This includes the following:
- Bayes Rule and Probability Review
- Marginal Distributions
- Joint Distribution
- Conditional Distribution
- Discrete vs Continuous Random Variables
- Bayes' Rule
- Independence
- The Gambler's Fallacy
- The Monty Hall Problem
- Maximum Likelihood Estimation
- The Bernoulli Distribution
- The Gaussian Distribution
- Unbiased Estimate of the Covariance Matrix
- Confidence Intervals
- Cumulative Distribution Function
- Confidence Interval Approximation
- Bernoulli Confidence Approximation
- The Bayesian Paradigm
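The Monty Hall problem from this review is easy to settle by simulation, which makes the 2/3-vs-1/3 answer concrete:

```python
import random

def monty_hall(trials=100_000, switch=True):
    """Estimate the win rate of the switch/stay strategies by simulation."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)
        pick = random.randrange(3)
        # Host opens a door that is neither the pick nor the car.
        # (When pick == car the host has a choice; which door he opens
        # doesn't affect the win rates, so we take the first valid one.)
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

random.seed(42)
print(monty_hall(switch=True))   # ≈ 2/3
print(monty_hall(switch=False))  # ≈ 1/3
```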
Today's Progress: Today I enrolled in the Bayesian Machine Learning in Python: A/B Testing course on Udemy.
Description: This includes the following:
- Introduction and Course Outline
- Real World Examples of A/B Testing
- Medicine
- Website
- Local Business Flyers
- What is Bayesian Machine Learning
- Bayesian vs Frequentist Approach
- Sampling
- Bayesian Networks
- Latent Dirichlet Allocation (LDA) Algorithm
Important Links
Udemy | Bayesian Machine Learning in Python: A/B Testing
Today's Progress: Today I completed the Unsupervised Machine Learning Hidden Markov Models in Python course on Udemy.
Description: This includes the following:
- Discrete HMMs using Deep Learning Libraries
- Gradient Descent
- Discrete HMM in TensorFlow
- HMMs for Continuous Observations
- Gaussian Mixture Models with Hidden Markov Models
- Generating Data from a Real-Valued HMM
- Continuous HMM in TensorFlow
- HMMs for Classification
- Generative vs Discriminative Classifiers
- HMM Classification on Poetry Data
- Parts-of-Speech Tagging
- PoS Tagging Concepts
- PoS Tagging with an HMM
Today's Progress: Today I continued the Unsupervised Machine Learning Hidden Markov Models in Python course on Udemy.
Description: This includes the following:
- Hidden Markov Models for Discrete Observations
- From Markov Models to Hidden Markov Models
- Latent Variables
- HMMs are Doubly Embedded
- How to choose the number of Hidden States
- K-Fold Cross-Validation
- The Forward-Backward Algorithm
- The Viterbi Algorithm
- The Baum-Welch Algorithm
- The Expectation-Maximization Algorithm
- Lagrange Multipliers
- Baum-Welch Updates for Multiple Observations
- The Underflow Problem and its Solution
- Viterbi (Applying Log)
- Scaling Forward
- Scaling Backward
- Implementation of Discrete HMM in Python
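The Viterbi algorithm covered here (including the log-space trick for the underflow problem) fits in a short function. The weather/activity HMM below is the classic textbook toy example, not data from the course:

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for an observation sequence,
    working in log space to avoid underflow."""
    log = math.log
    V = [{s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            best = max(states, key=lambda p: V[t - 1][p] + log(trans_p[p][s]))
            V[t][s] = V[t - 1][best] + log(trans_p[best][s]) + log(emit_p[s][obs[t]])
            back[t][s] = best
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):   # backtrack through the pointers
        path.append(back[t][path[-1]])
    return list(reversed(path))

states = ["Rainy", "Sunny"]
start = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
print(viterbi(["walk", "shop", "clean"], states, start, trans, emit))
# → ['Sunny', 'Rainy', 'Rainy']
```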
Today's Progress: Today I enrolled in the Unsupervised Machine Learning Hidden Markov Models in Python course on Udemy.
Description: This includes the following:
- Introduction
- Intro to Hidden Markov Models
- Common Use Cases for HMMs
- Unsupervised vs Supervised
- Markov Models
- The Markov Property
- Transition Probabilities
- Initial State Distribution
- Maximum Likelihood
- Smooth Estimates
- The Math of Markov Chains
- Stationary Distributions
- Limiting Distribution
- Markov Models: Examples, Problems and Applications
- Sick vs Healthy
- SEO and Bounce Rate Optimization
- 2nd-Order Language Model
- Python Implementation
- Eminem Style Rap Generation
- Google's Page Rank Algorithm
- Perron-Frobenius Theorem
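The stationary distribution covered here (the idea behind PageRank) can be found by power iteration: repeatedly apply π ← πP until it stops changing. The two-state transition matrix below is a toy example:

```python
def stationary(P, iters=200):
    """Stationary distribution of a Markov chain by power iteration.
    P is a row-stochastic transition matrix (rows sum to 1)."""
    n = len(P)
    pi = [1.0 / n] * n                     # start from the uniform distribution
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Two-state weather chain:
P = [[0.9, 0.1],   # sunny -> sunny / rainy
     [0.5, 0.5]]   # rainy -> sunny / rainy
print(stationary(P))  # → ≈ [0.833, 0.167], i.e. [5/6, 1/6]
```

By Perron-Frobenius, an irreducible aperiodic chain has a unique such distribution, which is why the iteration converges regardless of the starting point.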
Today's Progress: Today I completed the Smart Analytics, Machine Learning, and AI on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 1
- Productionizing Custom ML Models
- Phases of ML Projects
- Ways to do Custom ML on GCP
- Kubeflow
- AI Hub
- Lab 3: Running AI Models on Kubeflow
- Setting up Kubeflow on a Kubernetes Engine Cluster
- Packaging a TensorFlow Program in a Container and Uploading it to Google Container Registry
- Submitting a TF-train Job and Saving the Resulting Model to Google Cloud Storage
- Serving and Interacting with a Trained Model
- Week 2: Module 2
- BigQuery ML
- BigQuery ML for Quick Model Building
- Classification, Regression, and Recommender Models
- Unsupervised ML with Clustering Models
- Lab 4: Predict Bike Trip Duration with a Regression Model in BQML
- Querying and Exploring the London Bicycles Dataset for Feature Engineering
- Creating a Linear Regression Model in BQML
- Evaluating the Performance of your ML Model
- Extracting your Model Weights
- Lab 5: Movie Recommendations in BigQuery ML
- Training a Recommendation Model in BigQuery
- Making Product Predictions for Both Single Users and Batch Users
- Week 2: Module 3
- Cloud AutoML
- Why AutoML?
- AutoML Vision
- AutoML NLP
- AutoML Tables
Certificate | Smart Analytics, Machine Learning, and AI on GCP
Today's Progress: Today I enrolled in the Smart Analytics, Machine Learning, and AI on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 1: Module 1
- Introduction
- Analytics and AI
- What is ML?
- Machine Learning and AI
- ML Options on GCP
- Reviewing Key ML Concepts
- Prebuilt ML Model APIs
- Unstructured Data
- Lab 1: Using NL API to Classify Unstructured Text
- Creating a Natural Language API Request
- Calling the API with curl
- Using the NL API's Text Classification Feature
- Using Text Classification to Understand a Dataset of News Articles
- Week 1: Module 2
- Cloud AI Platform Notebooks
- What's a Notebook
- BigQuery Magic and Ties to Pandas
- Lab 2: BigQuery in Jupyter Labs on AI Platform
- Instantiating a Jupyter Notebook on AI Platform
- Executing a BigQuery Query from within a Jupyter Notebook and Processing the Output using Pandas
Coursera | Data Engineering with Google Cloud Professional Certificate
Coursera | Smart Analytics, Machine Learning, and AI on GCP
Today's Progress: Today I completed the Building Resilient Streaming Analytics Systems on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 3
- BigQuery: Advanced Functionality
- GIS Functions
- WITH Clauses vs Permanent Tables
- Analytical Window Functions
- Ranking Functions + ARRAYs
- Lab 5: Optimizing BigQuery Queries for Performance
- Using BigQuery to:
- Minimizing I/O
- Caching Results of Previous Queries
- Avoiding Overwhelming Single Workers
- Using Approximate Aggregation Functions
- Week 2: Module 4
- Performance Considerations
- I/O
- Shuffle
- Grouping
- Materialization
- Functions and UDFs
- Lab 6: Creating Date-Partitioned Tables in BigQuery
- Querying a Partitioned Dataset
- Creating Dataset Partitions to Improve Query Performance and Reduce Cost
Certificate | Building Resilient Streaming Analytics Systems on GCP
Today's Progress: Today I continued with the Building Resilient Streaming Analytics Systems on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 1
- Streaming into BigQuery
- Streaming
- Visualizing Results
- Lab 3: Streaming Analytics and Dashboards
- Connecting to a BigQuery data source
- Creating reports and charts to visualize BigQuery data
- Week 2: Module 2
- Streaming into Cloud Bigtable
- High-Throughput Streaming with Cloud Bigtable
- Optimizing Cloud Bigtable Performance
- Lab 4: Streaming Data Pipelines into Bigtable
- Launching Dataflow pipeline to read from Pub/Sub and writing into Bigtable
- Opening an HBase shell to query the Bigtable database
Today's Progress: Today I continued with the Building Resilient Streaming Analytics Systems on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 1: Module 2
- Cloud Dataflow Capabilities for Streaming Data
- Streaming Data Challenges
- Cloud Dataflow Windowing
- Lab 2: Streaming Data Pipelines
- Launching Dataflow and Running a Dataflow Job
- Understanding how Data Elements Flow through the Transformations of a Dataflow Pipeline
- Connecting Dataflow to Pub/Sub and BigQuery
- Observing and Understanding how Dataflow Autoscaling adjusts Compute Resources to Process Input Data Optimally
- Learning Where to find Logging Information Created by Dataflow
- Exploring Metrics and Creating Alerts and Dashboards with Stackdriver Monitoring
- Cloud Dataflow Capabilities for Streaming Data
Today's Progress: Today I enrolled in the Building Resilient Streaming Analytics Systems on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 1: Module 1
- Introduction
- Processing Streaming Data
- Cloud Pub/Sub
- Introduction to Pub/Sub
- Cloud Pub/Sub Push vs Pull
- Publishing with Pub/Sub Code
- Lab 1: Publish Streaming Data into Pub/Sub
- Creating a Pub/Sub Topic and Subscription
- Simulating your Traffic Sensor Data into Pub/Sub
Coursera | Data Engineering with Google Cloud Professional Certificate
Coursera | Building Resilient Streaming Analytics Systems on GCP
Today's Progress: Today I completed the Building Batch Data Pipelines on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 4
- Aggregating with GroupByKey and Combine
- Lab 4: MapReduce in Cloud DataFlow
- Identifying Map and Reduce Operations
- Executing the Pipeline
- Using Command Line Parameters
- Week 2: Module 5
- Side Inputs and Windows of Data
- Lab 5: Practicing Pipeline Side Inputs
- Trying out a BigQuery query
- Exploring the pipeline code
- Executing the pipeline
- Week 2: Module 6
- Cloud Dataflow Templates and SQL
- Creating and Reusing Pipeline Templates
- Cloud Dataflow SQL Pipelines
- Cloud Dataflow Templates and SQL
Certificate | Building Batch Data Pipelines on GCP
Today's Progress: Today I continued with the Building Batch Data Pipelines on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 3
- Running Batch Processing Pipelines on Cloud Dataflow
- Cloud Dataflow
- Why Customers value Dataflow
- Building Cloud Dataflow Pipelines in Code
- Key Considerations with Designing Pipelines
- Transforming Data with PTransforms
- Lab 3: Dataflow Pipeline
- Setting up a Python Dataflow project using Apache Beam
- Writing a simple pipeline in Python
- Executing the query on the local machine
- Executing the query on the cloud
- Running Batch Processing Pipelines on Cloud Dataflow
Today's Progress: Today I continued with the Building Batch Data Pipelines on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 2
- Cloud Composer
- Orchestrating work between GCP Services with Cloud Composer
- Apache Airflow
- DAGs and Operators
- Workflow Scheduling
- Monitoring and Logging
- Lab 3: Cloud Composer
- Using GCP Console to create the Cloud Composer environment
- Viewing and running the DAG (Directed Acyclic Graph) in the Airflow web interface
- Viewing the results of the wordcount job in storage
- Cloud Composer
Today's Progress: Today I continued with the Building Batch Data Pipelines on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 1
- Cloud Data Fusion
- Introduction
- Components of Data Fusion
- Building a Pipeline
- Exploring Data using Wrangler
- Lab 2: Cloud Data Fusion
- Connecting Cloud Data Fusion to a couple of data sources
- Applying basic transformations
- Joining two data sources
- Writing data to a sink
- Cloud Data Fusion
Today's Progress: Today I enrolled in the Building Batch Data Pipelines on GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 1: Module 1
- EL, ELT, ETL
- Refresher
- Quality Considerations
- Validity
- Accuracy
- Completeness
- Consistency
- Uniformity
- BigQuery for ELT
- Shortcomings of ELT
- ETL for Data Quality Issues
- Dataproc
- Dataflow
- Data Fusion
- Data Catalog
- EL, ELT, ETL
- Week 1: Module 2
- Executing Spark on Cloud Dataproc
- The Hadoop Ecosystem
- Running Hadoop on Cloud Dataproc
- GCS instead of HDFS
- Optimizing Dataproc
- Optimizing Dataproc Storage
- Optimizing Dataproc Templates and Autoscaling
- Optimizing Dataproc Monitoring
- Lab 1: Running Apache Spark Jobs on Cloud Dataproc
- Migrating existing Spark Jobs to Cloud Dataproc
- Modifying Spark Jobs to use Cloud Storage instead of HDFS
- Optimizing Spark Jobs to run on Job-Specific Clusters
- Executing Spark on Cloud Dataproc
Coursera | Data Engineering with Google Cloud Professional Certificate
Coursera | Building Batch Data Pipelines on GCP
Today's Progress: Today I completed the Data Science: From Prediction to Production on Udemy.
Description: This includes the following:
- The Importance of Well Written Code
- Respect Your Code
- Coding Standards
- Production vs Research Code
- Meet Uncle Bob
- Clean Code by Robert C. Martin
- Data Science Clean Code
- Advanced Topics in Predictive Modeling
- Three Factors which Impact Accuracy
- Reduce Noise Variance
- Better Features
- Better Representation
- Reduce Estimators Variance
- Increase Sample
- Increase Predictors Variance
- Decrease Predictors Correlation
- Reduce Noise Variance
- The Best Black-Box Model
- How to Use Dummies the Right Way
- The Price of Wrong Feature Set
- Omitting Relevant Variable
- Including Irrelevant Variable
- The Impact of Measurements Errors
- Errors that Create Bias
- Errors that don't Create Bias
- Heteroskedasticity Illness
- Three Factors which Impact Accuracy
Today's Progress: Today I continued with the Data Science: From Prediction to Production on Udemy.
Description: This includes the following:
- How to Plan the Development
- The Path from Concept to Production
- Theoretical Framework
- High Level Design
- Iteration 0
- Dry Runs
- Live Tests
- How to Measure Your Progress
- Breakthroughs
- Solid Theoretical Framework
- Start Live Tests
- Successful Live Tests
- Breakthroughs
- Scrum or Kanban
- How to Make Time Assessments
- Complexity Risk
- Technology Risk
- Tasks Risk
- The Path from Concept to Production
Today's Progress: Today I continued with the Data Science: From Prediction to Production on Udemy.
Description: This includes the following:
- What Makes You Professional
- Your Main Responsibility
- Building Applications
- How to Deliver?
- Learning from Software Developers
- Skills You Must Have
- Data Science Skills
- Analytical Skills
- Business Acumen
- Modeling Skills
- Statistics
- Machine Learning
- Good Sense about Data
- Common Sense
- Familiarity with Data Science Tools and Technologies
- Delivering Skills
- Deliver Fast
- Respond Quickly to Changes
- Build Robust Models
- Data Science Skills
- Developer or Researcher
- Your Main Responsibility
- Guidelines for Delivering Fast Results
- Mistakes as a Junior
- Six Important Principles
- Satisfy the Customer
- Welcome Changing Requirements
- Deliver Working Software Frequently
- Promote Sustainable Development
- Technical Excellence
- Simplicity
- Satisfy the Customer
- Understand the Customer's Business
- Deliver Early Valuable Software
- Open the Black Box
- Communicate Frequently
- Do a Soft Launch
- Simplicity is Essential
- "Start Lean, Thicker Later"
- "We Build Applications, Not Fancy Models"
- Practical Perspective about Scale
Today's Progress: Today I started the Data Science: From Prediction to Production on Udemy.
Description: This includes the following:
- Practical Perspective about Predictions
- Why Prediction is the Wrong Term
- Single Point Prediction is Useless
- Prediction is about Knowing the Distribution of the Future
- Models without the Appropriate Evaluation are Useless
- The Characteristics of a Good Prediction
- Different
- Model Type
- Feature List
- Expressiveness results in Different Error Distribution
- Properties of Good Prediction
- Stable Variance
- Light Tailed Variance
- Symmetry
- Known Distribution
- Different
- Why Prediction is the Wrong Term
- Guidelines for Selecting Models
- Why Linear Models are Great
- Stable Variance
- Simple
- Efficient
- Easy to Interpret
- Suitable for First Iteration
- When to Use Nonlinear Models
- Direct Relation is Clearly not Linear
- Many Features
- Structural Phenomenon
- The Risk of Nonlinear Modeling
- Overfitting
- Variance Instability
- Heavy Tailed Distribution
- Corner Solutions
- Terrible Performance
- Risk Management in Nonlinear Modeling
- Ensemble Methods
- Narrowing the Feature Space
- Specific Tuning Parameters
- Why Linear Models are Great
Important Links:
Udemy | Data Science: From Prediction to Production
Today's Progress: Today I studied about Reinforcement Learning for Stock Trading
Description: This includes the following:
- Intro to Reinforcement Learning
- Deep Q-Learning
- Defining States
- Time Series (Fixed Window)
- Cash in Hand
- Buying Current Stocks to Buy Better Stocks
- Defining Actions
- Buy/Sell/Hold Stock
- For N stocks, 3^N possibilities
- How many stocks to sell/buy?
- Simplified Actions
- Ignore Transaction Costs
- Avoid Knapsack Problem
- Buy Multiple Stocks in Round Robin Fashion
- Sell before Buy
- Defining Rewards
- Portfolio Value
- Minimal Trading Bot Implementation in Tensorflow 2.0
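The simplified action scheme described above can be sketched in plain Python; the function name and toy prices below are illustrative, not from the TensorFlow implementation mentioned:

```python
# Sketch of the simplified action handling: all sells execute first,
# then buys proceed one share at a time in round-robin fashion so the
# freed-up cash is split fairly across the chosen stocks.
def execute_actions(cash, prices, holdings, actions):
    """actions[i] is 'sell', 'hold' or 'buy' for stock i."""
    # 1) Sell before buy: liquidate every 'sell' position first.
    for i, act in enumerate(actions):
        if act == 'sell':
            cash += holdings[i] * prices[i]
            holdings[i] = 0
    # 2) Buy in round-robin fashion: one share of each 'buy' stock per
    #    pass until cash runs out -- sidestepping the knapsack problem
    #    of choosing optimal quantities.
    buy_list = [i for i, act in enumerate(actions) if act == 'buy']
    bought = True
    while bought:
        bought = False
        for i in buy_list:
            if cash >= prices[i]:
                cash -= prices[i]
                holdings[i] += 1
                bought = True
    return cash, holdings

cash, holdings = execute_actions(100.0, [30.0, 20.0], [2, 0], ['sell', 'buy'])
```

Selling the two shares of stock 0 frees 60.0, and the 160.0 total then buys eight shares of stock 1; the reward would be the change in portfolio value (cash plus holdings at current prices).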
Today's Progress: Today I studied about Recommendation Systems
Description: This includes the following:
- Introduction
- Content Based Methods
- Collaborative Filtering Methods
- Model Based
- Matrix Factorization
- Memory Based
- Item Centered Bayesian Classifier
- User Centered Linear Regression
- Model Based
- Hybrid Methods
- Evaluation of Recommendation Systems
- Metric Based Evaluation
- Human Based Evaluation
- Deep Learning for Recommendation Systems
- Embeddings
- Simple Movie Recommender System (TensorFlow 2.0)
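The content-based idea above lends itself to a short sketch: represent items as feature vectors and rank them by cosine similarity to something the user liked. The movie titles and genre vectors below are made up for illustration, not taken from the linked articles:

```python
import math

# Items described by binary feature vectors: [sci-fi, romance, horror]
movies = {
    "Alien":        [1, 0, 1],
    "The Thing":    [1, 0, 1],
    "Notting Hill": [0, 1, 0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(liked, k=1):
    # Rank every other item by similarity to the liked item's features.
    scores = {title: cosine(movies[liked], vec)
              for title, vec in movies.items() if title != liked}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("Alien"))  # → ['The Thing']
```

The same dot-product scoring carries over to learned embeddings, where the feature vectors come from training rather than hand-labeled genres.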
Important Links:
Article 1 | Introduction to Recommender Systems
Article 2 | Recommender Systems in Practice
Article 3 | Recommender Systems with Deep Learning Architectures
Today's Progress: Today I completed the Bash Scripting, Linux and Shell Programming Complete Guide on Udemy.
Description: This includes the following:
- Bash Scripting
- Bash File Structure
- Echo Command
- Comments
- Variables
- Strings
- Loops
- While
- For
- Until
- Break & Continue
- User Input
- Conditional Statements
- Case Statements
- Command Line Arguments
- Functions
- Global vs Local Variables
- Arrays
- Shell & Environment Variables
- Scheduled Automation
- Aliases
- Wildcards
- Multiple Commands
Today's Progress: Today I continued with the Bash Scripting, Linux and Shell Programming Complete Guide on Udemy.
Description: This includes the following:
- Users
- Run Commands as Superuser
- Change User
- Show Effective User and Group IDs
- Killing Programs and Logging Out
- Kill A Running Command
- Kill All Processes By a Name
- Logging Out of Bash
- Shortcuts
- No More Input
- Clear Screen
- Zoom In
- Zoom Out
- Moving the Cursor
- Deleting Text
- Fixing Typos
- Cutting and Pasting
- Character Capitalization
Today's Progress: Today I continued with the Bash Scripting, Linux and Shell Programming Complete Guide on Udemy.
Description: This includes the following:
- Getting Help
- Show Manual Description
- Search Manual
- Reference Manuals
- Working with Files/Folders
- Creating a Folder
- Creating a File
- Copy Files/Folders
- Move and Rename File/Folders
- Delete Files/Folders
- Delete Empty Folders
- Change File Permission
- Text Files
- File Concatenation
- File Perusal Filter
- Terminal Based Text Editor
Today's Progress: Today I started the Bash Scripting, Linux and Shell Programming Complete Guide on Udemy.
Description: This includes the following:
- Introduction
- Bash/Shell
- Terminal
- Shell
- Console
- Navigation
- Listing Folder Contents
- Print Current Folder
- Change Folder
- Using a Stack to Push Folders
- Check File Type
- Find File By Name & Update Locate Database
- Find a Command
- Show Command History
Important Links: Udemy | Bash Scripting, Linux and Shell Programming Complete Guide
Today's Progress: Today I completed the Modernizing Data Lakes and Data Warehouses with GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 2
- BigQuery as a Data Warehousing Solution
- Exploring Schemas
- Schema Design
- Nested and Repeated Fields
- Lab 4 (BigQuery: JSON and Array Data):
- Loading semi-structured JSON into BigQuery
- Creating and querying arrays
- Creating and querying structs
- Querying nested and repeated fields
- Partitioning and Clustering in BigQuery
- Optimizing with Partitioning and Clustering
- Creating Partitioned Tables
- Partitioning and Clustering
- Transforming Batch and Streaming Data
- BigQuery as a Data Warehousing Solution
Certificate | Modernizing Data Lakes and Data Warehouses with GCP
Today's Progress: Today I continued with the Modernizing Data Lakes and Data Warehouses with GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 1
- Building a Data Warehouse
- The Modern Data Warehouse
- Intro to BigQuery
- Querying TBs of Data in Seconds
- Loading Data
- Lab 3 (BigQuery):
- Loading Data into BigQuery
- Building a Data Warehouse
Today's Progress: Today I continued with the Modernizing Data Lakes and Data Warehouses with GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 1: Module 2
- Building a Data Lake
- Intro to Data Lakes
- Data Storage and ETL options on GCP
- Optimizing Cost with Google Cloud Storage classes and Cloud Functions
- Securing Cloud Storage
- Storing All Sorts of Data Types
- Running Federated Queries on Parquet and ORC files in BigQuery
- Storing Relational Data in the Cloud
- Cloud SQL as a Relational Data Lake
- Lab 1 (Cloud SQL):
- Loading Data into Cloud SQL
- Building a Data Lake
Today's Progress: Today I enrolled in the Modernizing Data Lakes and Data Warehouses with GCP course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 1: Module 1
- Role of Data Engineer
- Data Engineering Challenges
- Intro to BigQuery
- Data Lakes and Data Warehouses
- Transactional Databases vs Data Warehouses
- Effective Partnership with Other Data Teams
- Manage Data Access and Governance
- Build Production-Ready Pipelines
- GCP Customer Case Study
- Ocado
- Lab 1 (Analysis with BigQuery):
- Analyze 2 different public datasets
- Run queries on them, to derive interesting insights
- Separately
- Combined
Coursera | Data Engineering with Google Cloud Professional Certificate
Coursera | Modernizing Data Lakes and Data Warehouses with GCP
Today's Progress: Today I continued with the Google Cloud Platform Big Data and Machine Learning Fundamentals course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 2
- ML Driving Business Value
- ML on Unstructured Data
- Choosing the Right ML Approach
- Pre-Built AI Building Blocks
- Using Pre-Built AI to Create a Chatbot
- Customizing Pre-Built Models with AutoML
- Building a Custom Model
- Lab 5 (AutoML)
- Setup API key for ML Vision API
- Invoke the pretrained ML Vision API to classify images
- Review label predictions from Vision API
- Train and evaluate custom AutoML Vision image classification model
- Predict with AutoML on new image
Certificate | Google Cloud Platform Big Data and Machine Learning Fundamentals
Today's Progress: Today I continued with the Google Cloud Platform Big Data and Machine Learning Fundamentals course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 2: Module 1
- Modern Data Pipeline Challenges
- Message-Oriented Architectures
- Serverless Data Pipelines
- Designing Streaming Pipelines with Apache Beam
- Implementing Streaming Pipelines on Cloud DataFlow
- Data Visualization with Data Studio
- Building Collaborative Dashboards
- Tips and Tricks to create Charts with the Data Studio UI
- Lab 4 (Data Streaming Pipeline)
- Connect to a streaming data Topic in Cloud Pub/Sub
- Ingest streaming data with Cloud Dataflow
- Load streaming data into BigQuery
- Analyze and visualize the results
Today's Progress: Today I continued with the Google Cloud Platform Big Data and Machine Learning Fundamentals course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 1: Module 3
- Introduction to BigQuery
- Fast SQL Query Engine
- Managed Storage for Datasets
- Insights from Geographic Data
- Machine Learning on Structured Data
- Choosing the right model type
- Scenario: Predicting Customer Lifetime Value
- Creating ML Models with SQL
- Introduction to BigQuery ML
- ML Projects Phases
- Key Features Walkthrough
- Lab 3 (BigQuery ML)
- Use BigQuery to find public datasets
- Query and explore the ecommerce dataset
- Create a training and evaluation dataset to be used for batch prediction
- Create a classification (logistic regression) model in BQML
- Evaluate the performance of your machine learning model
- Predict and rank the probability that a visitor will make a purchase
- Introduction to BigQuery
Today's Progress: Today I continued with the Google Cloud Platform Big Data and Machine Learning Fundamentals course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 1: Module 2
- Recommendation Systems
- Data
- Model
- Training/Serving Infrastructure
- Google Storage Systems
- Cloud Storage
- Cloud SQL
- Cloud Spanner
- Datastore
- Bigtable
- BigQuery
- Hadoop Ecosystem
- Hadoop
- Hive
- Pig
- Spark
- Lab 2 (Recommendation System)
- Create Cloud SQL instance
- Create database tables by importing .sql files from Cloud Storage
- Populate the tables by importing .csv files from Cloud Storage
- Allow access to Cloud SQL
- Explore the rentals data using SQL statements from CloudShell
- Recommendation Systems
Today's Progress: Today I enrolled in the Google Cloud Platform Big Data and Machine Learning Fundamentals course of Data Engineering with Google Cloud Professional Certificate on Coursera.
Description: This includes the following:
- Week 1: Module 1
- Google Cloud Architecture
- Security
- Google IAM
- Compute Power
- On Demand VMs
- Storage
- Multiregional
- Regional
- Nearline
- Coldline
- Networking
- Edge Computing/Node/PoP
- Big Data/ ML Products
- GFS
- MapReduce
- Bigtable
- Dremel
- Pub/Sub
- TensorFlow, etc.
- Security
- Lab 1 (BigQuery)
- Query a public dataset
- Create a custom table
- Load data into a table
- Query a table
- GCP Approaches
- Compute Engine
- Google Kubernetes Engine (GKE)
- App Engine
- Cloud Functions
- Business Use Cases w/ GCP
- Google Cloud Architecture
Important Links: Coursera | Data Engineering with Google Cloud Professional Certificate
Coursera | Google Cloud Platform Big Data and Machine Learning Fundamentals
Today's Progress: Today I completed and reviewed the Time Series Analysis in Python 2020 course on Udemy.
Description: I learned the following things:
- Differentiate between time series data and cross-sectional data.
- Understand the fundamental assumptions of time series data and how to take advantage of them.
- Transform a data set into a time series.
- Start coding in Python and learn how to use it for statistical analysis.
- Carry out time-series analysis in Python and interpret the results, based on the data in question.
- Examine the crucial differences between related series like prices and returns.
- Comprehend the need to normalize data when comparing different time series.
- Encounter special types of time series like White Noise and Random Walks.
- Learn about "autocorrelation" and how to account for it.
- Learn about accounting for "unexpected shocks" via moving averages.
- Discuss model selection in time series and the role residuals play in it.
- Comprehend stationarity and how to test for its existence.
- Acknowledge the notion of integration and understand when, why and how to properly use it.
- Realize the importance of volatility and how we can measure it.
- Forecast the future based on patterns observed in the past.
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- Business Case: Automobile Industry
- Analysing the data leading up to the Volkswagen buyout of Porsche
- In Retrospect Approach
- The Dieselgate Scandal
- Auto ARIMA
- Predictions using Exogenous Variables
- Measuring Volatility
- GARCH
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- Forecasting
- Forecast vs Prediction
- Forecasting Time Series:
- ARMA
- ARIMA
- ARIMAX
- SARIMA
- SARIMAX
- GARCH
- Pitfalls of Forecasting
- Multivariate Forecasting
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- The Auto ARIMA Model
- The Auto AutoRegressive Integrated Moving Average (ARIMA) Model
- Manual vs Automatic Empirical Analysis
- Pros & Cons
- Fitting Default Best Fit Auto ARIMA Model
- Auto ARIMA with Custom Arguments
- Implementation in Python
- The Auto AutoRegressive Integrated Moving Average (ARIMA) Model
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- The ARCH Model
- The AutoRegressive Conditional Heteroskedasticity (ARCH) Model
- Volatility
- EGARCH
- Fitting a Simple ARCH Model
- Fitting Higher Lag ARCH Model
- Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) Model
- Volatility Clustering
- Implementation in Python
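A minimal simulation of the ARCH(1) recursion helps make volatility clustering concrete: today's conditional variance depends on yesterday's squared return. Parameter values are toy choices, not from the course:

```python
import random

random.seed(0)
omega, alpha = 0.1, 0.5        # ARCH(1) parameters (toy values)
returns = []
r_prev = 0.0
for _ in range(1000):
    sigma2 = omega + alpha * r_prev ** 2      # conditional variance
    r_prev = random.gauss(0.0, sigma2 ** 0.5)
    returns.append(r_prev)

# The unconditional variance should be near omega / (1 - alpha) = 0.2,
# but quiet and turbulent stretches alternate (volatility clustering).
sample_var = sum(r * r for r in returns) / len(returns)
```

GARCH adds lagged conditional variances to the same recursion, which is why the two models are fit with the same machinery.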
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- The ARIMA Model
- The AutoRegressive Integrated Moving Average (ARIMA) Model
- Fitting a Simple ARIMA Model
- Fitting Higher-Lag ARIMA Model
- Higher Levels of Integration
- Outside Factors
- Exogenous Variables
- ARMAX
- ARIMAX
- Seasonal Models
- SARMA
- SARIMA
- SARIMAX
- Predicting Stability
- Volatility & Variance
- Implementation in Python
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- The ARMA Model
- The AutoRegressive Moving Average (ARMA) Model
- Fitting a Simple ARMA Model
- Fitting Higher-Lag ARMA Model
- Examining the ARMA Model Residuals
- ARMA Models and Non-Stationary Data
- Implementation in Python
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- Adjusting to Shocks: The MA Model
- Moving Average (MA) Model
- Fitting the MA Model
- Fitting Higher-Lag MA Model
- Examining the MA Model Residuals
- Model Selection for Normalized Returns
- Past Values and Past Errors
- Implementation in Python
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- Modelling AutoRegression
- AR Model
- ACF & PACF
- Dickey-Fuller Test
- LLR Test
- Error Analysis w/ Residuals
- Implementation in Python
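The ACF used for picking AR lag orders can be computed by hand. Below is a plain-Python sketch of the sample autocorrelation, checked against a simulated AR(1) series; the setup is illustrative, not the course's code:

```python
import random

def acf(series, lag):
    """Sample autocorrelation at the given lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) ** 2 for v in series) / n
    ck = sum((series[t] - mean) * (series[t - lag] - mean)
             for t in range(lag, n)) / n
    return ck / c0

# Simulate an AR(1) process x_t = 0.8 * x_{t-1} + e_t; its theoretical
# ACF decays geometrically: corr(x_t, x_{t-k}) = 0.8 ** k.
random.seed(1)
x, prev = [], 0.0
for _ in range(5000):
    prev = 0.8 * prev + random.gauss(0.0, 1.0)
    x.append(prev)
```

For an AR(p) process the ACF tails off like this while the PACF cuts off after lag p, which is the diagnostic the course uses to choose the model order.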
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- Picking the Correct Model
- Significant Coefficients
- Parsimonious
- Log-Likelihood Ratio Test
- AIC & BIC
- Residuals
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- Working with Time Series in Python
- White Noise
- Constant Mean
- Constant Variance
- No Autocorrelation
- Random Walk
- Market Efficiency
- Arbitrage
- Stationarity
- Covariance Stationary
- Constant Mean
- Constant Variance
- Consistent Covariance b/w different Time Periods
- Determining Weak Form Stationarity
- Dickey-Fuller Test
- Covariance Stationary
- Seasonality
- Decomposition
- Trend
- Seasonal
- Residual
- Naive Decomposition
- Additive
- Multiplicative
- Decomposition
- Autocorrelation
- AutoCorrelation Function (ACF)
- Partial AutoCorrelation Function (PACF)
- Python Implementation
- White Noise
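The white noise vs random walk contrast above can be demonstrated in a few lines: the walk's variance grows with time, so it fails covariance stationarity until differenced. A standard-library sketch with toy parameters:

```python
import random

random.seed(42)
T = 2000
noise = [random.gauss(0.0, 1.0) for _ in range(T)]   # white noise

walk, level = [], 0.0
for e in noise:
    level += e            # random walk: y_t = y_{t-1} + e_t
    walk.append(level)

def var(xs):
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)

# Differencing the walk once recovers a stationary series again,
# which is the idea behind testing for (and removing) integration.
diff = [walk[t] - walk[t - 1] for t in range(1, T)]
```

A Dickey-Fuller test formalizes this check: it asks whether the series behaves like the walk (unit root) or like the stationary differenced series.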
Today's Progress: Today I continued the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- Creating a Time Series Object
- Transforming String Inputs into DateTime Values
- Using Date as an Index
- Setting the Frequency
- Filling Missing Values
- Adding and Removing Columns in DataFrame
- Splitting Up the Data
- Implementation in Python
Today's Progress: Today I enrolled in the Time Series Analysis in Python 2020 course on Udemy.
Description: This includes the following:
- Introduction
- Setting Up the Environment
- Introduction to Time Series in Python
- Time Periods
- Frequency
- Pattern Persistence
- Notations in Time-Series
- Peculiarities of Time Series
- Implementation in Python
- Loading Data
- Exploring Data
- Visualizing Data
- QQ Plot
Important Links: Udemy | Time Series Analysis in Python 2020
Today's Progress: Today I completed and reviewed the Python for Finance: Investment Fundamentals & Data Analytics course on Udemy.
Description: This includes the following:
- Rate of return of stocks
- Risk of stocks
- Rate of return of stock portfolios
- Risk of stock portfolios
- Correlation between stocks
- Covariance
- Diversifiable and non-diversifiable risk
- Regression analysis
- Alpha and Beta coefficients
- Measuring a regression’s explanatory power with R^2
- Markowitz Efficient frontier calculation
- Capital asset pricing model
- Sharpe ratio
- Multivariate regression analysis
- Monte Carlo simulations
- Using Monte Carlo in a Corporate Finance context
- Derivatives and type of derivatives
- Applying the Black Scholes formula
- Using Monte Carlo for options pricing
- Using Monte Carlo for stock pricing
Today's Progress: Today I continued with the Python for Finance: Investment Fundamentals & Data Analytics course on Udemy.
Description: This includes the following:
- Part 17: Monte Carlo Simulations as a decision-making Tool
- Monte Carlo Simulations
- Monte Carlo in Corporate Finance Setting
- Revenues
- Cost of Goods Sold
- Gross Profit
- COGS & Opex
- Asset Pricing with Monte Carlo
- Brownian Motion
- Drift
- Volatility
- Brownian Motion
- Derivative Contracts
- Assets
- Stocks
- Bonds
- Interest Rates
- Commodities
- Exchange Rates
- Groups of Derivatives
- Hedging
- Speculating
- Arbitrageurs
- Types of Derivatives
- Forwards
- Futures
- Swaps
- Options
- Call Options vs Put Options
- Assets
- The Black Scholes Formula
- Efficient Markets
- Transaction Costs
- No Dividend Payments
- Known Volatility & Risk-Free Rate
- Implementation in Python
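The Black-Scholes formula and Monte Carlo asset pricing discussed above can be cross-checked against each other. This is a standard-library sketch (math.erf supplies the normal CDF); the strike, volatility, and path count are illustrative choices:

```python
import math, random

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_scholes_call(S, K, r, sigma, T):
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def mc_call(S, K, r, sigma, T, n=200_000, seed=7):
    """Price the same call by simulating geometric Brownian motion
    (drift + volatility) to the option's expiry."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        ST = S * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
        total += max(ST - K, 0.0)          # call payoff at expiry
    return math.exp(-r * T) * total / n    # discount the average payoff

bs = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)   # ≈ 10.45
mc = mc_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

The two prices agree to within Monte Carlo noise, which shrinks as the number of simulated paths grows.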
Today's Progress: Today I continued with the Python for Finance: Investment Fundamentals & Data Analytics course on Udemy.
Description: This includes the following:
- Part 16: Multivariate Regression Analysis
- Fundamentals of Multivariate Regression
- Higher Dimensions
- R-Squared
- P-Value
- Beta Coefficients
- Implementation in Python
- Fundamentals of Multivariate Regression
Today's Progress: Today I continued with the Python for Finance: Investment Fundamentals & Data Analytics course on Udemy.
Description: This includes the following:
- Part 15: The Capital Asset Pricing Model
- The Capital Asset Pricing Model
- Market Portfolio
- Risk Free Asset
- Beta Coefficient
- Capital Market Line
- Market Risk Premium
- Sharpe Ratio
- Achieving Alpha
- Types of Investment Strategies
- Passive Investing
- Active Investing
- Arbitrage Trading
- Value Investing
- Implementation in Python
- Calculating Beta of a Stock
- Calculating CAPM of a Stock
- Calculating Sharpe Ratio
- The Capital Asset Pricing Model
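The lab's three calculations reduce to a few formulas: beta = cov(stock, market) / var(market), the CAPM expected return r_f + beta(r_m - r_f), and the Sharpe ratio. A sketch with made-up return series and an assumed risk-free rate:

```python
def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance (cov(x, x) gives the sample variance)."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)

stock  = [0.02, -0.01, 0.03, 0.00]    # toy periodic returns
market = [0.01, -0.02, 0.02, 0.01]

beta = cov(stock, market) / cov(market, market)   # systematic exposure
rf, rm = 0.025, 0.10                  # assumed risk-free and market returns
capm = rf + beta * (rm - rf)          # CAPM expected return
sharpe = (capm - rf) / cov(stock, stock) ** 0.5   # excess return per unit risk
```

A beta near 1 means the stock tends to move with the market; the Sharpe ratio then normalizes the excess return by the stock's own volatility.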
Today's Progress: Today I continued with the Python for Finance: Investment Fundamentals & Data Analytics course on Udemy.
Description: This includes the following:
- Part 14: Markowitz Portfolio Optimization
- Markowitz Portfolio Theory
- Single Investment vs Diversified Portfolio
- Markowitz Efficient Frontier
- Implementation in Python
- Calculating Expected Portfolio Return
- Calculating Expected Portfolio Variance
- Calculating Expected Portfolio Volatility
- Calculating Markowitz Efficient Frontier
- Markowitz Portfolio Theory
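For two assets the portfolio formulas from the lab fit in a few lines: expected return is the weighted mean of the asset returns, and variance is the weighted quadratic form including the correlation term. The weights, returns, and correlation below are assumed toy values:

```python
w     = [0.6, 0.4]      # portfolio weights (sum to 1)
mu    = [0.08, 0.12]    # expected annual returns
sigma = [0.15, 0.25]    # annual volatilities
rho   = 0.3             # correlation between the two assets

# Expected portfolio return: weighted average of asset returns.
exp_ret = w[0] * mu[0] + w[1] * mu[1]

# Portfolio variance: w1^2 s1^2 + w2^2 s2^2 + 2 w1 w2 rho s1 s2.
variance = (w[0] ** 2 * sigma[0] ** 2
            + w[1] ** 2 * sigma[1] ** 2
            + 2 * w[0] * w[1] * rho * sigma[0] * sigma[1])
volatility = variance ** 0.5
```

Sweeping the weights and plotting (volatility, expected return) pairs traces out the feasible set whose upper boundary is the Markowitz efficient frontier.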
Today's Progress: Today I continued with the Python for Finance: Investment Fundamentals & Data Analytics course on Udemy.
Description: This includes the following:
- Part 13: Using Regression for Financial Analysis
- Fundamentals of Simple Regression
- Univariate Regression
- Multivariate Regression
- Good vs Bad Regression
- R-squared
- Implementation in Python
- Running Regression model for House Price Data
- Calculating:
- Slope (beta)
- Intercept (alpha)
- R Value
- R-squared Value
- P Value
- Standard Error
- Fundamentals of Simple Regression
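The regression outputs listed above (slope, intercept, R-squared) follow directly from the least-squares formulas; the house-price-style numbers here are invented for illustration:

```python
size  = [65, 74, 85, 100, 120]      # e.g. house size in m^2 (made up)
price = [130, 150, 170, 200, 240]   # price in thousands (made up)

n = len(size)
mx, my = sum(size) / n, sum(price) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(size, price))
sxx = sum((x - mx) ** 2 for x in size)

beta  = sxy / sxx                   # slope
alpha = my - beta * mx              # intercept
ss_res = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(size, price))
ss_tot = sum((y - my) ** 2 for y in price)
r_squared = 1 - ss_res / ss_tot     # explanatory power
```

These are the same quantities a library routine like scipy.stats.linregress reports, which makes the by-hand version a useful sanity check.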
Today's Progress: Today I continued with the Python for Finance: Investment Fundamentals & Data Analytics course on Udemy.
Description: This includes the following:
- Implementation in Python
- Calculating Simple Rate of Return
- Calculating Log Rate of Return
- Calculating Rate of Return of Indices
- Calculating Risk of a Security
- Variance
- Correlation
- Volatility
- Calculating Risk of an Investment Portfolio
- Calculating Diversifiable and Non-Diversifiable Risk
Today's Progress: Today I enrolled in the Python for Finance: Investment Fundamentals & Data Analytics course on Udemy.
Description: This includes the following:
- Part 1-10: Python Refresher
- Skipped
- Part 11: Calculating and Comparing Rates of Return
- Rate of Return of Stocks
- Simple Returns
- Log Returns
- Rate of Return of Stock Portfolios
- Market Indices
- S&P500
- Dow Jones Industrial Average
- NASDAQ
- Market Indices
- Rate of Return of Stocks
- Part 12: Measuring Investment Risk
- Risk of Stocks
- Variability
- Variance
- Standard Deviation
- Variability
- Risk of Stock Portfolios
- Portfolio Diversification
- Covariance
- Correlation
- Un-diversifiable (Systematic) Risk vs Diversifiable (Idiosyncratic) Risk
- Risk of Stocks
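The two return definitions from Part 11 can be checked numerically: log returns add across periods, while simple returns compound. Prices below are toy values:

```python
import math

p0, p1, p2 = 100.0, 110.0, 121.0    # prices over two periods

simple_1 = p1 / p0 - 1              # 10% simple return
simple_2 = p2 / p1 - 1              # another 10%
log_1    = math.log(p1 / p0)        # ≈ 9.53% log return

# Simple returns compound; log returns simply add up.
total_simple = (1 + simple_1) * (1 + simple_2) - 1    # 21%, = p2/p0 - 1
total_log    = math.log(p1 / p0) + math.log(p2 / p1)  # = log(p2/p0)
```

The additivity of log returns is why they are preferred for multi-period and statistical work, while simple returns map directly to portfolio arithmetic.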
Important Links: Udemy | Python for Finance: Investment Fundamentals & Data Analytics
Today's Progress: Today I completed and reviewed the Artificial Intelligence for Business course on Udemy.
- OPTIMIZE BUSINESS PROCESSES
- Implement Q-Learning
- Build an Optimization Model
- Maximize Efficiency
- MINIMIZE COSTS
- Implement Deep Q-Learning
- Build an AI Environment from scratch
- Build an Artificial Brain
- Master the General AI Framework
- Save and Load a model
- Implement Early Stopping
- MAXIMIZE REVENUES
- Implement Thompson Sampling
- Leverage AI to make the best decision
- Implement Online Learning
- Implement Regret Analysis
Today's Progress: Today I continued with the Artificial Intelligence for Business course on Udemy.
Description: This includes the following:
- Part-3: Maximizing Revenue
- AI Solution
- The Multi-Armed Bandit Problem
- Thompson Sampling
- Implementation in Python
- AI Solution
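Thompson Sampling for a Bernoulli multi-armed bandit fits in a short standard-library sketch: keep a Beta posterior per arm, sample a plausible rate from each, and play the argmax. The conversion rates are made up; this mirrors the idea, not the course's exact code:

```python
import random

random.seed(0)
true_rates = [0.05, 0.10, 0.30]     # assumed, hidden from the algorithm
wins, losses = [0, 0, 0], [0, 0, 0]

for _ in range(5000):
    # Sample a plausible rate per arm from its Beta(wins+1, losses+1)
    # posterior, then play the arm whose sample is highest.
    samples = [random.betavariate(wins[i] + 1, losses[i] + 1)
               for i in range(3)]
    arm = samples.index(max(samples))
    if random.random() < true_rates[arm]:
        wins[arm] += 1
    else:
        losses[arm] += 1

pulls = [wins[i] + losses[i] for i in range(3)]
best_arm = pulls.index(max(pulls))   # should settle on the 0.30 arm
```

Because posterior sampling is random, poorly explored arms still get occasional pulls; as evidence accumulates the play concentrates on the best arm, which is exactly the exploration/exploitation balance the lecture describes.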
Today's Progress: Today I continued with the Artificial Intelligence for Business course on Udemy.
Description: This includes the following:
- Part-3: Maximizing Revenue
- Case Study: Maximizing Revenue of an Online Retail Business
- Problem to Solve
- Environment to Define
- Defining States
- Defining Actions
- Defining Reward Function
- Case Study: Maximizing Revenue of an Online Retail Business
Today's Progress: Today I continued with the Artificial Intelligence for Business course on Udemy.
Description: This includes the following:
- Part-2: Minimizing Costs
- Case Study: Minimizing Costs in Energy Consumption of a Data Center
- Problem to Solve
- Environment to Define
- Defining States
- Defining Actions
- Defining Reward Function
- AI Solution
- Deep Q-Learning: Intuition
- Deep Q-Learning: Action
- Experience Replay
- Action Selection Policies
- Implementation in Python
- Case Study: Minimizing Costs in Energy Consumption of a Data Center
Today's Progress: Today I enrolled in the Artificial Intelligence for Business course on Udemy.
Description: This includes the following:
- Introduction to Course
- Part-1: Optimizing Business Process
- Case Study: Optimizing the Flows in an E-Commerce Warehouse
- Problem to Solve
- Environment to Define
- Defining States
- Defining Actions
- Defining Reward Function
- AI Solution
- Intro to Reinforcement Learning
- The Bellman Equation
- The "Plan"
- Markov Decision Process (MDP)
- Deterministic Search
- Non-Deterministic Search
- Policy vs Plan
- Adding a "Living Penalty"
- Q-Learning: Intuition
- Temporal Difference
- Q-Learning: Visualization
- Implementation in Python
- Case Study: Optimizing the Flows in an E-Commerce Warehouse
Important Links:
- Udemy | Artificial Intelligence for Business
- Relevant Codebase
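The Q-Learning intuition from Part-1 (Bellman equation, temporal difference) can be sketched with a toy tabular agent. This is not the course's warehouse environment, just a made-up 5-state chain with hand-picked hyperparameters:

```python
import random

def q_learning_chain(n_states=5, n_episodes=300, alpha=0.5,
                     gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a chain of states 0..n_states-1.

    Actions: 0 = left, 1 = right (deterministic moves, clamped at the
    ends). Reaching the rightmost state gives reward 1 and ends the
    episode. The temporal-difference update implements the Bellman
    equation: Q(s,a) <- Q(s,a) + alpha*(r + gamma*max_a' Q(s',a') - Q(s,a)).
    """
    rng = random.Random(seed)
    goal = n_states - 1
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(n_episodes):
        s = 0
        while s != goal:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else min(goal, s + 1)
            reward = 1.0 if s_next == goal else 0.0
            Q[s][a] += alpha * (reward + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q

Q = q_learning_chain()
```

With a discount of 0.9, the learned values settle near gamma^(steps-to-goal - 1), e.g. roughly 0.729 for "go right" from state 0, and the greedy policy points right everywhere.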
Today's Progress: Today I ended and reviewed the Taming Big Data with Apache Spark and Python - Hands On! course on Udemy.
Description: I learned following skills in this course:
- Use DataFrames and Structured Streaming in Spark 3
- Frame big data analysis problems as Spark problems
- Use Amazon's Elastic MapReduce service to run your job on a cluster with Hadoop YARN
- Install and run Apache Spark on a desktop computer or on a cluster
- Use Spark's Resilient Distributed Datasets to process and analyze large data sets across many CPU's
- Implement iterative algorithms such as breadth-first-search using Spark
- Use the MLLib machine learning library to answer common data mining questions
- Understand how Spark SQL lets you work with structured data
- Understand how Spark Streaming lets you process continuous streams of data in real time
- Tune and troubleshoot large jobs running on a cluster
- Share information between nodes on a Spark cluster using broadcast variables and accumulators
- Understand how the GraphX library helps with network analysis problems
Today's Progress: Today I continued the Taming Big Data with Apache Spark and Python - Hands On! course on Udemy.
Description: This includes the following:
- Other Spark Technologies
- Intro to MLLib
- Using DataFrames with MLLib
- Spark Streaming
- GraphX
Today's Progress: Today I continued the Taming Big Data with Apache Spark and Python - Hands On! course on Udemy.
Description: This includes the following:
- Intro to SparkSQL
- DataFrame:
- Executing SQL commands
- SQL-style function
- Using DataFrame instead of RDDs
Today's Progress: Today I continued the Taming Big Data with Apache Spark and Python - Hands On! course on Udemy.
Description: This includes the following:
- Elastic MapReduce
- Partitioning
- Troubleshooting Spark on a Cluster
- Managing Dependencies
Today's Progress: Today I continued the Taming Big Data with Apache Spark and Python - Hands On! course on Udemy.
Description: This includes the following:
- Advanced Examples of Spark Programs
- Activities
- Finding Most Popular Movie
- Using Broadcast Variables
- Activities
- Using Graphs
- Superhero Degree of Separation
- Intro to Breadth-First Search
- Accumulators, and Implementing BFS in Spark
- Superhero Degree of Separation
- Item-Based Collaborative Filtering in Spark
- cache() and persist()
- Activity
- Similar Movie Script using Spark's Cluster Manager
Today's Progress: Today I continued the Taming Big Data with Apache Spark and Python - Hands On! course on Udemy.
Description: This includes the following:
- Spark Basics and Simple Examples
- Intro to Spark
- Resilient Distributed Dataset (RDD)
- Key/Value RDDs
- Example: Average Friends by Age
- Filtering RDDs
- Example: Minimum Temperature by Location
- Example: Maximum Temperature by Location
- Map vs FlatMap
- Example: Word Count
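Spark itself isn't needed to see the Map vs FlatMap distinction behind the Word Count example; the same pipeline can be mirrored in plain Python (the sample lines below are made up, and Counter stands in for reduceByKey):

```python
from collections import Counter
from itertools import chain

lines = [
    "the quick brown fox",
    "the lazy dog",
    "the fox",
]

# map produces one output element per input element:
# here, a *list of words* for each line
mapped = [line.split() for line in lines]

# flatMap flattens those per-line lists into one stream of words
flat_mapped = list(chain.from_iterable(line.split() for line in lines))

# reduceByKey-style aggregation: count occurrences per word
counts = Counter(flat_mapped)
```

The key point: after map you have 3 elements (one list per line), while after flatMap you have 9 individual words, which is the shape the counting step needs.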
Today's Progress: Today I enrolled in the Taming Big Data with Apache Spark and Python - Hands On! course on Udemy.
Description: This includes the following:
- Getting Started with Spark
- Introduction to course
- Setting up the Environment
- Running First Spark Program
- Ratings Histogram for MovieLens Movie Rating Dataset
Important Links:
Today's Progress: Today I learned to use the normal distribution as an approximation of the binomial distribution, when appropriate.
Description: This includes the following:
- Normal Random Variables
- Applications
- Approximation to Binomial
- Continuity Correction
- Applications
- Wrap-Up: Random Variable
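The normal approximation with continuity correction can be checked numerically with nothing but the standard library. The n, p, and cutoff below are my own example values (the p = 0.5 shortcut in the exact sum only works for a fair coin):

```python
from math import comb, erf, sqrt

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), using the stdlib error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p = 100, 0.5
mu = n * p                      # mean of Binomial(n, p)
sigma = sqrt(n * p * (1 - p))   # its standard deviation

# Continuity correction: the discrete event X <= 45 corresponds to
# the continuous interval up to 45.5 under the normal curve.
approx = normal_cdf(45.5, mu, sigma)

# Exact binomial probability for comparison (valid since p = 0.5,
# so every outcome sequence has probability 1/2^n)
exact = sum(comb(n, k) for k in range(46)) / 2 ** n
```

Both come out near 0.184, which is why the approximation is considered appropriate when n*p and n*(1-p) are both large.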
Today's Progress: Today I learned to explain how a density function is used to find probabilities involving continuous random variables. I also learned to find probabilities associated with the normal distribution.
Description: This includes the following:
- Continuous Random Variables
- Probability Distribution
- Discrete Random Variable
- Probability Distribution
- Normal Random Variables
- Standard Deviation Rule
- Standardizing Values
- Standard Normal Table
- Introduction
- Finding z value
- Working with Non-standard Normal Values
- Finding X value
Today's Progress: Today I learned to fit the binomial model when appropriate, and use it to perform simple calculations.
Description: This includes the following:
- Binomial Random Variables
- Binomial Experiment
- Probability Distribution
- Mean and Standard Deviation
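The binomial formulas from today (pmf, mean n*p, variance n*p*(1-p)) are simple enough to verify directly; n and p below are arbitrary example values:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3
mean = n * p                # 3.0
variance = n * p * (1 - p)  # 2.1

# Sanity check: the pmf sums to 1 over all possible outcomes
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
```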
Today's Progress: Today I learned how to find the mean and variance of a discrete random variable, and apply these concepts to solve real-world problems. I also learned to apply the rules of means and variances to find the mean and variance of a linear transformation of a random variable and the sum of two independent random variables.
Description: This includes the following:
- Discrete Random Variables
- Mean and Variance
- Introduction
- Applications
- Standard Deviation
- Rules for Mean and Variances
- Add, Subtract, Multiplication by Constant
- Linear Transformation
- Sum of Two Variables
- Mean and Variance
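The rules for means and variances under a linear transformation can be confirmed on a small made-up pmf (the distribution and constants below are my own example, not from the course):

```python
def mean(pmf):
    """E[X] for a discrete pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[(X - E[X])^2]."""
    m = mean(pmf)
    return sum(p * (x - m) ** 2 for x, p in pmf.items())

# A small hypothetical distribution
pmf_x = {0: 0.2, 1: 0.5, 2: 0.3}
a, b = 3, 4

# Linear transformation Y = aX + b: the shift b moves the mean but
# not the spread, so E[Y] = a*E[X] + b and Var(Y) = a^2 * Var(X)
pmf_y = {a * x + b: p for x, p in pmf_x.items()}
```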
Today's Progress: Today I continued on the first course of the Tensorflow: Data and Deployment Specialization.
Description: This includes the following:
- Week 4
- Train a model in your web browser by using images captured via a webcam
- Apply transfer learning to train a model to recognize hand gestures of rock, paper, and scissors
- Apply transfer learning to train a model to recognize hand gestures of rock, paper, scissors, lizard, and spock
Today's Progress: Today I continued on the first course of the Tensorflow: Data and Deployment Specialization.
Description: This includes the following:
- Week 3
- Use a toxicity model to determine if a phrase is toxic in a number of categories
- Use Mobilenet to detect objects in images
- Use the tensorflow.js converter to convert a Keras model to JSON format
Today's Progress: Today I started the first course of the Tensorflow: Data and Deployment Specialization.
Description: This includes the following:
- Week 1
- Use TensorFlow.js to build and train simple machine learning models in JavaScript
- Use Web Server for Chrome to serve web pages from a local folder over the network using HTTP.
- Describe the key characteristics of one-hot encoding
- Use TensorFlow.js to load data from a CSV file
- Week 2
- Use tfjs-vis to visualize the output of callbacks
- Use a convolutional neural network to build a handwriting classifier
- Use a sprite sheet to train a classifier
Important Links:
Today's Progress: Today I learned to distinguish between discrete and continuous random variables. Also learned to find the probability distribution of discrete random variables, and use it to find the probability of events of interest.
Description: This includes the following:
- Discrete Random Variables
- Random Variables
- Introduction
- Count vs Measure
- Random Variables
- Probability Distribution
- Table of Outcomes
- Probability Histograms
- Applications
- Using Conditional Probability
Today's Progress: Today I learned to use the General Multiplication Rule to find the probability that two events occur (P(A and B)) and to use probability trees as a tool for finding probabilities.
Description: This includes the following:
- Multiplication Rule
- General Multiplication Rule
- Definition
- Applications
- General Multiplication Rule
- Probability Trees
- Definition
- Applications
- Other Methods
- Wrap-Up: Conditional Probability and Independence
Today's Progress: Today I learned about the reasoning behind conditional probability, and how this reasoning is expressed by the definition of conditional probability. Also learned to find conditional probabilities and interpret them, and determine whether two events are independent or not.
Description: This includes the following:
- Conditional Probability
- Reasoning
- Definition
- Independence Check
- Compare P(B | A) and P(B)
- Other Methods
Today's Progress: Today I learned how to apply probability rules in order to find the likelihood of an event. I also learned to use tools such as Venn diagrams or probability tables as aids for finding probabilities, when appropriate.
Description: This includes the following:
- Probability Rules
- Range and Sum Rules
- Complement Rule
- Disjoint Events
- Addition Rule for Disjoint Events
- P(A and B) for Independent Events
- Multiplication Rule for Independent Events
- Extensions
- At Least One of...
- General Addition Rule
- Probability Tables
- Solving Problems
- Wrap-Up: Finding Probability of Events
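The "at least one of ..." pattern is the complement rule in action, and it can be brute-force checked by enumerating outcomes (the trial count and success probability below are invented for illustration):

```python
from itertools import product
from math import prod

p, n = 0.1, 5   # 5 independent trials, success probability 0.1 each

# Complement rule: P(at least one success) = 1 - P(no successes)
p_at_least_one = 1 - (1 - p) ** n

# Brute-force check: sum the probability of every outcome sequence
# that contains at least one success
check = sum(
    prod(p if hit else 1 - p for hit in seq)
    for seq in product([0, 1], repeat=n)
    if any(seq)
)
```

Both routes give about 0.4095, which is why the complement shortcut is so much more convenient than enumerating 2^n outcomes.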
Today's Progress: Today I learned how to determine the sample space of a given random experiment. I also learned to find the probability of events in the case in which all outcomes are equally likely.
Description: This includes the following:
- Probability of Events
- Sample Spaces
- Random Experiments
- Events of Interest
- Equally Likely Outcomes
- Overview
- Examples
Today's Progress: Today I learned how to relate the probability of an event to the likelihood of this event occurring. I also learned how relative frequency can be used to estimate the probability of an event.
Description: This includes the following:
- Empirical Methods for Determining Probability
- Verifying Classical Probability
- Relative Frequency
- Definition
- Law of Large Numbers
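The Law of Large Numbers is easy to watch in a simulation: the relative frequency of an event settles toward its classical probability as the number of trials grows. A die-rolling sketch (my own toy example):

```python
import random

random.seed(42)
P_SIX = 1 / 6   # theoretical/classical probability of rolling a six

def relative_frequency(n_trials):
    """Empirical estimate of P(six) from n_trials simulated fair-die rolls."""
    hits = sum(1 for _ in range(n_trials) if random.randint(1, 6) == 6)
    return hits / n_trials

freq_small = relative_frequency(100)      # can wander noticeably
freq_large = relative_frequency(100_000)  # hugs 1/6 closely
```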
Today's Progress: Today I learned how to relate the probability of an event to the likelihood of this event occurring.
Description: This includes the following:
- Probability
- Introduction
- The Bigger Picture
- Intuition
- Formal Definition
- Introduction
- Determining Probability
- Theoretical/Classical
- Empirical/Observational
Today's Progress: Today I learned how to identify the design of a study (controlled experiment vs. observational study) and other features of the study design (randomized, blind etc.).
Description: This includes the following:
- Experiments: More than One Explanatory Variable
- Modification to Randomization
- Wrap-Up: Designing Studies
- Summary: Producing Data
Today's Progress: Today I learned how to identify the design of a study (controlled experiment vs. observational study) and other features of the study design (randomized, blind etc.).
Description: This includes the following:
- Experiments: One Explanatory Variable
- Causation and Experiments
- Randomized Controlled Experiments
- Inclusion of a Control Group
- Blind and Double-Blind Experiments
- Pitfalls
- Causation and Experiments
Today's Progress: Today I learned how the study design impacts the types of conclusions that can be drawn. Also learned to determine how the features of a survey impact the collected data and the accuracy of the data.
Description: This includes the following:
- Observational Studies
- Causation and Observational Studies
- Lurking Variables
- Other Pitfalls
- Design Issues
- Summary
- Causation and Observational Studies
Today's Progress: Today I learned to use Convolutions on top of DNNs and RNNs and then put it all together using a real-world data series -- one which measures sunspot activity over hundreds of years.
Description: This includes the following:
- Week 4: Real-World Time Series Data
- Convolutions
- Bi-Directional LSTMs
- Batch Sizing
- Training and Tuning
- Prediction
Important Links:
- Certificate | Sequences, Time Series and Prediction
- Certificate | TensorFlow in Practice Specialization
Today's Progress: Having explored time series and some of their common attributes, such as trend and seasonality, and having used statistical methods for projection, today I learned to use neural networks to recognize and predict on time series. I also learned that Recurrent Neural Networks and Long Short-Term Memory networks are really useful for classifying and predicting on sequential data.
Description: This includes the following:
- Week 2: Deep Neural Networks for Time Series
- Data Preparation
- Sequence Bias
- Feeding Windowed Data to Neural Network
- Prediction
- Week 3: Recurrent Neural Network for Time Series
- Lambda Layers
- Dynamically adjusting Learning Rate
- Huber Loss
- RNN
- LSTM
Today's Progress: Today I learned about the nature of time series data, and saw some of the more common attributes of them, including things like seasonality and trend. I also looked at some statistical methods for predicting time series data also.
Description: This includes the following:
- Week 1: Sequences and Prediction
- Introduction
- Common Patterns
- Trend
- Seasonality
- White Noise
- Autocorrelation
- Impulses
- Metrics for Evaluation
- MSE
- RMSE
- MAE
- Moving Average and Differencing
- Trailing vs Centered Windows
- Forecasting
Important Links:
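The moving-average forecasting and MAE metric from Week 1 above fit in a few lines of plain Python; the series and window size here are a made-up toy, not course data:

```python
def moving_average_forecast(series, window):
    """Forecast each point as the mean of the preceding `window` values
    (a trailing window, as opposed to a centered one)."""
    return [sum(series[i - window:i]) / window
            for i in range(window, len(series))]

def mae(actual, predicted):
    """Mean Absolute Error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

series = [10, 12, 11, 13, 12, 14, 13, 15]   # hypothetical time series
preds = moving_average_forecast(series, window=3)
error = mae(series[3:], preds)   # compare forecasts with what happened
```

A trailing window only uses past values, so it can be computed in real time; a centered window is smoother but needs future values, so it only works retrospectively.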
Today's Progress: Today I learned to use TensorFlow for various Natural Language Processing problems.
Description: This includes the following:
- Sentiment in Text
- Text to Sequence
- Tokenizer
- Padding
- Word Embeddings
- Introduction
- Vectors
- Loss Function
- Pre-Tokenized Datasets
- Sequence Models
- LSTMs
- Accuracy and Loss
- Convolutional Networks
- Sequence Models and Literature
- Subword Tokenization
- Text Generation
- Shakespearean Poetry Generation
Important Links:
- Coursera | Natural Language Processing in TensorFlow
- Certificate | Natural Language Processing in TensorFlow
Today's Progress: Today I learned to identify the design of a study (controlled experiment vs. observational study) and other features of the study design (randomized, blind etc.).
Description: This includes the following:
- Producing Data: Designing Studies
- Introduction
- Types of Studies
- Experimental Studies
- Observational Studies
- Prospective
- Retrospective
Today's Progress: Today I learned various techniques by which one can choose a sample of individuals from an entire population to collect data from. This is seemingly a simple step in the big picture of statistics, but it turns out that it has a crucial effect on the conclusions we can draw from the sample about the entire population.
Description: This includes the following:
- Producing Data: Sampling
- Types of Samples
- Volunteer Sample
- Convenience Sample
- Sampling Frame
- Systematic Sampling
- Probability Sampling Plans
- Simple Random Sampling
- Cluster Sampling
- Stratified Sampling
- Wrap-Up: Sampling
Today's Progress: Today I summarized how to explore the relationship between the explanatory and response variables using visual displays and numerical measures, and how to choose what kind of measure to use based on the role-type classification of the two variables. I also emphasized how important it is to interpret any observed association in the context of the problem, but NOT to be tempted to interpret association as causation, due to the possible presence of lurking variables.
Description: This includes the following:
- Wrap-Up: Examining Relationships
- Summary: EDA
Today's Progress: Today I learned how to recognize the distinction between association and causation, and identify potential lurking variables for explaining an observed relationship. Association does not imply causation!
Description: This includes the following:
- Causation and Lurking Variables
- Introduction
- Confounds
- Simpson's Paradox
Today's Progress:
Description: This includes the following:
- AWS Builders Online Series
- Introductory Guide to AWS Cost Management and Efficiency
- Move Fast & Be Secure on AWS Cloud
- AWS Purpose-Built Database Strategy: The Right Tool for The Right Job
- Host your Static Website on Amazon Simple Storage Service (S3)
- Building Serverless Applications that Scale
- Project Management Professional Certification: Introduction
Today's Progress: Today I learned that a special case of the relationship between two quantitative variables is the linear relationship, in which a straight line simply and adequately summarizes the relationship. When the scatterplot displays a linear relationship, we supplement it with the correlation coefficient (r). The least-squares regression line has the smallest sum of squared vertical deviations of the data points from the line. Extrapolation is prediction for values of the explanatory variable that fall outside the range of the data.
Description: Following topics were covered:
- Case Q → Q: Linear Relationships
- Introduction
- Correlation
- R - Coefficient of Correlation
- Properties of R
- Regression
- Least Squares Regression
- Intercept and Slope
- Predictions
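The least-squares slope and intercept (and r) come straight from the deviation sums, so they can be computed by hand. The data points below are a hypothetical set lying near y = 2x + 1:

```python
from math import sqrt

def least_squares(xs, ys):
    """Slope and intercept minimizing the sum of squared vertical
    deviations, plus the correlation coefficient r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx                 # b1 = cov(x, y) / var(x)
    intercept = my - slope * mx       # line passes through (x-bar, y-bar)
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]      # hypothetical responses
slope, intercept, r = least_squares(xs, ys)
```

Predictions are then just `intercept + slope * x`, keeping in mind that x outside 1..5 here would be extrapolation.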
Today's Progress: Today I learned how to graphically display the relationship between two quantitative variables and describe: a) the overall pattern and b) striking deviations from the pattern.
Description: Following topics were covered:
- Case Q → Q
- Two Quantitative Variables
- Scatterplots
- Introduction
- Interpretation
- Examples
- Labeled
- Exercises
Today's Progress: Today I learned about the C → C relationship between two categorical variables: building a two-way table and interpreting the information stored in it about the association between the two variables by comparing conditional percentages.
Description: Following topics were covered:
- Case C → C
- Two Categorical Variables
- Conditional Percents
- Exercises
Today's Progress: Today I learned about how to examine relationships between 2 variables using visual displays and numerical summaries.
Description: This includes the following topics:
- EDA: Examining Relationships
- Exploring Two Variables: Explanatory and Response
- Role-Type Classification
- Case C → Q
- Introduction
- Applications
Today's Progress: Today I learned that the range covered by the data is the most intuitive measure of spread and is exactly the distance between the smallest data point (Min) and the largest one (Max). Another measure of spread is the inter-quartile range (IQR), which is the range covered by the middle 50% of the data. The IQR can be used to detect outliers using the 1.5(IQR) criterion: outliers are observations that fall below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR). The five-number summary of a distribution consists of the median (M), the two quartiles (Q1, Q3), and the extremes (Min, Max). The standard deviation measures spread by reporting a typical (average) distance between the data points and their average.
Description: This includes the following topics:
- One Quantitative Variable: Measure of Spread
- Range
- Inter-Quartile Range
- Using IQR to detect outliers
- Outliers
- Identification
- Understanding
- Handling
- Boxplots
- Standard Deviation
- Idea
- Notion
- Calculation
- Properties
- Standard Deviation Rule
- Wrap-Up: EDA
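The 1.5(IQR) criterion for detecting outliers can be sketched directly. Note the quartiles here use the median-of-halves convention, which is one common choice; software packages differ slightly. The data set is a made-up example with one suspicious value:

```python
def quartiles(data):
    """Q1 and Q3 via the median-of-halves convention."""
    s = sorted(data)
    n = len(s)

    def median(v):
        m = len(v) // 2
        return v[m] if len(v) % 2 else (v[m - 1] + v[m]) / 2

    lower = s[:n // 2]                       # below the overall median
    upper = s[n // 2 + 1:] if n % 2 else s[n // 2:]  # above it
    return median(lower), median(upper)

def iqr_outliers(data):
    """Flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < lo or x > hi]

data = [12, 13, 14, 15, 15, 16, 17, 18, 45]   # 45 looks suspicious
outliers = iqr_outliers(data)
```

Here Q1 = 13.5 and Q3 = 17.5, so the fences sit at 7.5 and 23.5, and only 45 is flagged.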
Today's Progress: Learned how to quantify the center and spread of distribution with various numerical measures, some of the properties of those numerical measures; and how to choose the appropriate numerical measures of center and spread to supplement the histogram.
Description: This includes the following:
- One Quantitative Variable: Measures of Center
- Introduction
- Mode
- Median
- Mean
- Comparison b/w Mean and Median
Today's Progress: Learned about one quantitative variable and how to represent it using a histogram and a stemplot. I also learned how to interpret these graphs for further insights.
Description: This includes the following:
- One Quantitative Variable: Graphs
- Introduction
- Histogram
- Intervals
- Shape
- Center, Spread, and Outliers
- Stemplot
Today's Progress: Got a formal intro to statistics, learned about Exploratory Data Analysis and One Categorical Variable
Description: This includes the following:
- Introduction to Statistics
- Exploratory Data Analysis Overview
- Data and Variables
- Scales of Measurement
- Examining Distributions
- One Categorical Variable
- Frequency Distributions
- Pie and Bar Charts
- Pictograms
Today's Progress:
- Setup Repository for #thepersonalmsds
- Created template for Social Media
- Enrolled in Stanford University's Probability & Statistics Course
Description: None
Important Links: Stanford | Probability & Statistics