# 20-09-02: Daily Practice

---
---

## Practices

### Daily

* [x] [Job Search](#2-Hour-Job-Search)
  * [x] 2HJS progress
  * [ ] Network
  * [ ] Social media
* [ ] [Meta-data: read & write](#Meta-data:-read-&-write)
* [x] [Practice & learn](#Practice-&-learn)
  * [x] Data science and machine learning
  * [ ] Algorithms, data structures, and brain teasers
  * [x] Interviewing, STAR storytelling

### Weekly

Vim command(s) to practice this week:

* `^` : jump to first non-blank character in the current line

---
---

## 2-Hour Job Search

### Cover Letters

I succeeded in staying focused and succinct in my writing for a while longer than usual today. Stoked about the position!

---
---

## Meta-data: read & write

---
---

## Practice & learn

---

### Data science and machine learning

Finished chapter 4 of Grokking Deep Learning. I like how the author outlines the intuition and reasoning behind gradient descent. This way of thinking about it is a little different from how I've thought about it before, and much more fundamental. Insights like that are why I decided to read through this book before diving into more advanced ones: it helps to have diverse perspectives.

### Grokking Deep Learning - Chapter 4: Gradient Descent

* Do neural networks make accurate predictions?
* Why measure error?
* Hot and cold learning
* Calculating both direction and amount from error
* Gradient descent
* Learning is just reducing error
* Derivatives and how to use them to learn
* Divergence and alpha

> Or, the "compare" and "learn" part of the "predict, compare, and learn" process.

Compare provides a measurement of how far off a prediction was, while learning tells each weight how it can change to reduce the error.
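
"Hot and cold learning" appears in the outline above but has no code in these notes. Here is a minimal sketch, assuming the usual formulation: nudge the weight up and down by a fixed step and keep whichever direction lowers the error. `step_amount`, the iteration cap, and the name `input_value` (used to avoid shadowing the built-in `input`) are illustrative choices, not taken from the notes.

```python
# Hot and cold learning: try the weight a little higher and a little lower,
# then keep whichever direction reduces the error.
weight = 0.5
input_value = 0.5
goal_pred = 0.8
step_amount = 0.001   # assumed fixed step size, chosen for illustration

for iteration in range(2000):   # assumed cap, large enough for this example
    pred = input_value * weight
    error = (pred - goal_pred) ** 2

    # Error if the weight were nudged up or down by one step
    up_error = (input_value * (weight + step_amount) - goal_pred) ** 2
    down_error = (input_value * (weight - step_amount) - goal_pred) ** 2

    if up_error < error and up_error <= down_error:
        weight += step_amount
    elif down_error < error:
        weight -= step_amount
    else:
        break   # neither direction helps, so stop

print(f"Weight: {weight}; Prediction: {input_value * weight}")
```

This approach only ever knows the direction to move, not how far, which is the gap the gradient descent sections below fill.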

#### Compare

Does your network make good predictions?

* Measuring error simplifies the problem
* Different ways of measuring error prioritize error differently
  * Mean squared error amplifies large errors (greater than 1) and shrinks small errors (less than 1)
  * Squaring also ensures that the error is never negative (there can't be negative error!)
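
To make the effect of squaring concrete, here is a small sketch comparing the pure error, its absolute value, and its square for a few made-up predictions. The prediction values and the helper name `squared_error` are hypothetical and only there for illustration.

```python
def squared_error(pred, goal_pred):
    """Squared difference between a prediction and its target."""
    return (pred - goal_pred) ** 2

goal_pred = 0.8

# Hypothetical predictions: a near miss, a moderate miss, and two big misses
for pred in (0.79, 0.3, 2.8, -1.2):
    pure_error = pred - goal_pred
    print(
        f"pred={pred:5.2f}  pure={pure_error:6.2f}  "
        f"abs={abs(pure_error):5.2f}  squared={squared_error(pred, goal_pred):7.4f}"
    )
```

Misses larger than 1 grow when squared, misses smaller than 1 shrink, and the sign disappears in every case.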

In [19]:

    knob_weight = 0.5
    input = 0.5
    goal_pred = 0.8

    pred = input * knob_weight
    error = (pred - goal_pred) ** 2

    print(error)

Output:

    0.30250000000000005


#### Calculating both direction and amount from error

AKA gradient descent!

* Stopping: if `input` is 0 then `direction_and_amount` is also 0 - there's nothing to learn
* Negative reversal: multiplying pure error by `input` will reverse the sign of `direction_and_amount` if `input` is negative
  * Ensures `weight` moves in the right direction even if `input` is negative
* Scaling: if `input` is large, the `weight` update should be large too
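
The three effects of multiplying the pure error by `input` can be checked one at a time. This is a minimal sketch: `direction_and_amount` is wrapped in a function here for convenience, and `input_value` stands in for `input` to avoid shadowing the built-in. The specific input values are made up.

```python
def direction_and_amount(input_value, weight, goal_pred):
    """Pure error multiplied by the input, as in the update rule in these notes."""
    pred = input_value * weight
    return (pred - goal_pred) * input_value

weight, goal_pred = 0.5, 0.8

# Stopping: a zero input produces a zero update, so the weight never moves.
print("stopping:", direction_and_amount(0.0, weight, goal_pred))

# Negative reversal: the prediction is too low, but because the input is
# negative the update comes out positive, so subtracting it lowers the weight,
# which is what raises the prediction when the input is negative.
print("negative reversal:", direction_and_amount(-0.5, weight, goal_pred))

# Scaling: a larger input yields a larger update in magnitude.
print("small input:", direction_and_amount(0.5, weight, goal_pred))
print("large input:", direction_and_amount(5.0, weight, goal_pred))
```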

In [1]:

    weight = 0.5
    goal_pred = 0.8
    input = 0.5

    for iteration in range(20):
        pred = input * weight
        # `(pred - goal_pred)` is the pure error
        error = (pred - goal_pred) ** 2
        # input: scaling, negative reversal, and stopping
        direction_and_amount = (pred - goal_pred) * input
        # Recalculate weight
        weight = weight - direction_and_amount

        print(f"Error: {error}; Prediction: {pred}")

Output:

    Error: 0.30250000000000005; Prediction: 0.25
    Error: 0.17015625000000004; Prediction: 0.3875
    Error: 0.095712890625; Prediction: 0.49062500000000003
    Error: 0.05383850097656251; Prediction: 0.56796875
    Error: 0.03028415679931642; Prediction: 0.6259765625
    Error: 0.0170348381996155; Prediction: 0.669482421875
    Error: 0.00958209648728372; Prediction: 0.70211181640625
    Error: 0.005389929274097089; Prediction: 0.7265838623046875
    Error: 0.0030318352166796153; Prediction: 0.7449378967285156
    Error: 0.0017054073093822882; Prediction: 0.7587034225463867
    Error: 0.0009592916115275371; Prediction: 0.76902756690979
    Error: 0.0005396015314842384; Prediction: 0.7767706751823426
    Error: 0.000303525861459885; Prediction: 0.7825780063867569
    Error: 0.00017073329707118678; Prediction: 0.7869335047900676
    Error: 9.603747960254256e-05; Prediction: 0.7902001285925507
    Error: 5.402108227642978e-05; Prediction: 0.7926500964444131
    Error: 3.038685878049206e-05; Prediction: 0.7944875723333098
    Error: 1.7092608064027242e-05;

In [3]:

    # Perform weight update / gradient descent on single training example

    # Note: this `weight` is overwritten below, and `alpha` is never applied in this cell
    weight = 0.1
    alpha = 0.01

    def neural_network(input, weight):
        prediction = input * weight
        return prediction

    weight, goal_pred, input = (0.0, 0.8, 0.5)

    for iteration in range(4):
        # Make prediction, calculate error and delta
        pred = neural_network(input, weight)
        error = (pred - goal_pred) ** 2
        delta = pred - goal_pred
        # Scale the pure error by the input to get the weight update
        weight_delta = input * delta
        weight = weight - weight_delta
        print(f"Error: {error}; Prediction: {pred}")

Output:

    Error: 0.6400000000000001; Prediction: 0.0
    Error: 0.3600000000000001; Prediction: 0.2
    Error: 0.2025; Prediction: 0.35000000000000003
    Error: 0.11390625000000001; Prediction: 0.4625


#### What is `weight_delta`, really?

What is a function, and how can one be understood?

* A function defines some sort of relationship between the input and output
* Every function has moving parts that can be tweaked to change the output

    error = ((input * weight) - goal_pred) ** 2

`weight` is the only part of the function above that can be changed while still conforming to the patterns in the data: `input` and `goal_pred` are fixed by the data.

> Key point: learning is adjusting `weight` to reduce `error` to 0.
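
To see `error` as a function of `weight` alone, here is a short sketch that holds the data fixed and sweeps a few candidate weights. The helper name `error_for` and the list of weights are illustrative; `input_value` stands in for `input`.

```python
input_value = 0.5   # fixed by the data
goal_pred = 0.8     # fixed by the data

def error_for(weight):
    """Squared error as a function of the one adjustable part, the weight."""
    return ((input_value * weight) - goal_pred) ** 2

# Sweep a few candidate weights; only the weight changes between rows.
for weight in (0.0, 0.8, 1.6, 2.4, 3.2):
    print(f"weight={weight:.1f}  error={error_for(weight):.4f}")
```

The error falls as `weight` approaches 1.6 (where `0.5 * 1.6` hits the goal of 0.8) and rises again on the other side, which is the bowl shape that learning walks down.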

##### Derivatives

The derivative describes how one variable moves when another is changed. Or, it's the slope describing the relationship.

This slope can be used to figure out how to force one to change in a specific direction by modifying the other.

The important thing is to understand what the derivative represents.

Again, again, again! The derivative is the relationship between two variables in a function: it describes how much one changes in response to a change in the other. It's the sensitivity between the two variables.
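
One way to build intuition for that sensitivity is to measure it directly: wiggle `weight` by a tiny amount and see how much `error` moves. This is a sketch using a central finite difference; `error_for`, `numerical_slope`, and the step size `h` are hypothetical names and values chosen for illustration.

```python
input_value = 0.5
goal_pred = 0.8

def error_for(weight):
    """Squared error for a given weight, with the data held fixed."""
    return ((input_value * weight) - goal_pred) ** 2

def numerical_slope(weight, h=1e-6):
    """Approximate d(error)/d(weight) by nudging the weight both ways."""
    return (error_for(weight + h) - error_for(weight - h)) / (2 * h)

# Below, at, and above the weight that gives zero error (1.6)
for weight in (0.5, 1.6, 2.5):
    print(f"weight={weight:.1f}  slope={numerical_slope(weight):+.4f}")
```

A negative slope means increasing `weight` lowers the error, a positive slope means decreasing it does, and a slope near zero means the weight is already at the bottom.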

##### Using a derivative to learn

> `weight_delta` is the derivative of `error` with respect to `weight` (up to a constant factor of 2)!

To use the derivative in finding the minimum `error`, move (change `weight`) in the opposite direction of the slope (derivative).
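
A short derivation, starting from the error function defined earlier, shows where the update rule comes from. The symbols $w$, $x$, and $g$ are shorthand introduced here for `weight`, `input`, and `goal_pred`.

```latex
\begin{aligned}
\text{error}(w) &= (x w - g)^2 \\
\frac{d\,\text{error}}{dw} &= 2x\,(x w - g) \\
\texttt{weight\_delta} &= x\,(x w - g) = \tfrac{1}{2}\,\frac{d\,\text{error}}{dw} \\
w_{\text{new}} &= w - \texttt{weight\_delta}
\end{aligned}
```

Since `weight_delta` always has the same sign as the slope, subtracting it moves `weight` toward lower error, as long as the step is not so large that it overshoots.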

##### Divergence and alpha

With a big `input`, the prediction is very sensitive to changes in `weight`. Small changes in `weight` will lead to massive swings in the prediction.

This can be combated using alpha, the learning rate: multiply `weight_delta` by a fraction before subtracting it from `weight`.
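
To see the divergence itself, here is a sketch that runs the same single-weight update twice with `input` set to 2: once with `alpha = 1.0` (equivalent to having no learning rate) and once with `alpha = 0.1`. The `train` helper and its parameter names are hypothetical, written only for this comparison.

```python
def train(input_value, alpha, weight=0.5, goal_pred=0.8, iterations=20):
    """Run the single-weight update and return the error at each iteration."""
    errors = []
    for _ in range(iterations):
        pred = input_value * weight
        errors.append((pred - goal_pred) ** 2)
        derivative = input_value * (pred - goal_pred)
        weight = weight - alpha * derivative
    return errors

no_alpha = train(input_value=2, alpha=1.0)     # same as leaving alpha out
with_alpha = train(input_value=2, alpha=0.1)

print(f"alpha=1.0: first error {no_alpha[0]:.4f}, last error {no_alpha[-1]:.4e}")
print(f"alpha=0.1: first error {with_alpha[0]:.4f}, last error {with_alpha[-1]:.4e}")
```

With `alpha = 1.0` every update overshoots the goal by more than the previous miss, so the error grows each iteration. With `alpha = 0.1` each step removes part of the miss and the error shrinks.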

In [5]:

    weight = 0.5
    goal_pred = 0.8
    # Earlier implementation would break with this input
    # but this one doesn't, thanks to our friend alpha!
    input = 2
    alpha = 0.1

    for iteration in range(20):
        pred = input * weight
        error = (pred - goal_pred) ** 2
        derivative = input * (pred - goal_pred)
        weight = weight - (alpha * derivative)
        print("Error:" + str(error) + " Prediction:" + str(pred))

Output:

    Error:0.03999999999999998 Prediction:1.0
    Error:0.0144 Prediction:0.92
    Error:0.005183999999999993 Prediction:0.872
    Error:0.0018662400000000014 Prediction:0.8432000000000001
    Error:0.0006718464000000028 Prediction:0.8259200000000001
    Error:0.00024186470400000033 Prediction:0.815552
    Error:8.70712934399997e-05 Prediction:0.8093312
    Error:3.134566563839939e-05 Prediction:0.80559872
    Error:1.1284439629823931e-05 Prediction:0.803359232
    Error:4.062398266736526e-06 Prediction:0.8020155392
    Error:1.4624633760252567e-06 Prediction:0.8012093235200001
    Error:5.264868153690924e-07 Prediction:0.8007255941120001
    Error:1.8953525353291194e-07 Prediction:0.8004353564672001
    Error:6.82326912718715e-08 Prediction:0.8002612138803201
    Error:2.456376885786678e-08 Prediction:0.8001567283281921
    Error:8.842956788836216e-09 Prediction:0.8000940369969153
    Error:3.1834644439835434e-09 Prediction:0.8000564221981492
    Error:1.1460471998340758e-09 Prediction:0.8000338533188895
    Error:4.125769919393652e-10 Prediction:0.80002031199

#### Chapter 4: Review and solidify


* [ ] Build the code for the above neural network implementation from scratch, without looking at notes
  * Leave ELI5 comments on every single line

---

### Algorithms, data structures, and brain teasers

---

### Interviewing and STAR storytelling

This question was asked by: LinkedIn

Let's say that you're working on a job recommendation engine. You have access to all user LinkedIn profiles, a list of jobs each user applied to, and answers to questions that the user filled in about their job search.

Using this information, how would you build a job recommendation feed?