## Project 0 - Creating Z-scores from scratch

In this [DEMO project](https://github.com/Alex-Caian/GIT_Demo) we will create Z-scores from scratch, then test our manual calculations against another library to check the results.

This project only consists of two parts:

1. Generating Z-scores

2. Testing the Z-scores

We'll start by importing necessary libraries.

In [1]:
import numpy as np ## For math 
import math ## For math 

from sklearn.preprocessing import StandardScaler ## For testing our results

### Part 1 - Generating Z-scores

We first define a couple of cases to test on, in the form of arrays/lists:

In [2]:
test1 = [1,5,2,6,7,2,3,6,2,1,0]
test2 = [3,3,2,6,2,2,4]
test3 = [1,2,5,3,3,6,7,5,14,35,2,1,4,6,2,99]

First, we need to know the formula for Z-scores:

<img src="https://github.com/Alex-Caian/GIT_Demo/blob/main/Zscore.png?raw=True" />
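In case the image does not render, here is the standard definition of a Z-score written out in LaTeX. This assumes the image shows the usual form; the symbols $x$, $\mu$ and $\sigma$ are the common convention, not something taken from the image itself.

```latex
z = \frac{x - \mu}{\sigma}
```

Here $x$ is a single value from the array, $\mu$ is the mean of the array and $\sigma$ is its standard deviation.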

As per the formula, there are two elements we need to know:

> The mean

> The standard deviation

Let's start with the easier one.

#### The mean

In [6]:
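def mean(array):
    ## Fail early if the input cannot be looped over
    assert hasattr(array, '__iter__'), "Not an iterable."
    ## Return 0 for an empty input to avoid dividing by zero
    if len(array) == 0:
        return 0
    return sum(array)/len(array)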

In [7]:
## Run a test to check that it works
mean(test1)

3.1818181818181817

In [8]:
## Edge case: an empty list returns 0 instead of raising an error.. need to look over it again?
mean([])
## Change 1

0
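Returning `0` for an empty list is one possible choice, but it hides the fact that the mean of no values is undefined. As a sketch of an alternative, the hypothetical `mean_strict` below (not part of the original project) raises a `ValueError` instead, and is compared against `statistics.fmean` from the standard library:

```python
import math
import statistics


def mean_strict(array):
    """Arithmetic mean that refuses empty input (hypothetical variant)."""
    values = list(array)  # also accepts generators, which have no len()
    if not values:
        raise ValueError("mean of an empty sequence is undefined")
    return sum(values) / len(values)


test1 = [1, 5, 2, 6, 7, 2, 3, 6, 2, 1, 0]

# Should agree with the standard library on a non-empty list
print(math.isclose(mean_strict(test1), statistics.fmean(test1)))

# An empty list now raises instead of silently returning 0
try:
    mean_strict([])
except ValueError as err:
    print("ValueError:", err)
```

Which behaviour is better depends on the caller: returning `0` keeps a pipeline running, while raising makes the bad input visible straight away.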

Next up, the standard deviation. 

<img src="https://github.com/Alex-Caian/GIT_Demo/blob/main/stdev.png?raw=true" />
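Written out in LaTeX, and assuming the image shows the population form (dividing by $N$, which is what the `stdev` function in this section does), the standard deviation is:

```latex
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}
```

The sample version of the formula divides by $N - 1$ instead of $N$.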

Once again, we need to make use of the mean. Good thing we defined the function already!

#### The standard deviation

In [9]:
def stdev(array):
    mu = mean(array) ## Compute the mean once, not once per element
    numerator = sum([(number - mu)**2 for number in array])
    denominator = len(array) ## Population form: divide by N
    return np.sqrt(numerator/denominator)

In [10]:
## Again, run a test to check it works:
stdev(test1)
## Change 2

2.289032420366213
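Note that `stdev` divides by `len(array)`, so it computes the population standard deviation rather than the sample one (which divides by `N - 1`). A quick cross-check is sketched below using `statistics.pstdev` and `np.std`, whose default is `ddof=0`. The function is repeated here only so the snippet runs on its own, and `data` is just an illustrative name:

```python
import math
import statistics

import numpy as np


def stdev(array):
    """Population standard deviation (divides by N), same formula as above."""
    mu = sum(array) / len(array)
    return np.sqrt(sum((x - mu) ** 2 for x in array) / len(array))


data = [1, 5, 2, 6, 7, 2, 3, 6, 2, 1, 0]  # same values as test1

# Population form: all three should agree
print(math.isclose(stdev(data), statistics.pstdev(data)))
print(math.isclose(stdev(data), np.std(data)))  # ddof=0 by default

# The sample form divides by N - 1, so it comes out slightly larger
print(np.std(data, ddof=1) > stdev(data))
```

This matters for Part 2: a reference implementation that used the sample form would not match our results.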

Finally, we can bring them together to generate the Z-scores for our array!

In [11]:
def Zscore(array):
    mu, sigma = mean(array), stdev(array) ## Compute each once, not once per element
    return [(number - mu)/sigma for number in array]

In [12]:
## Initial sanity test
Zscore(test1)

[-0.9531617649474451,
 0.7943014707895377,
 -0.5162959560131993,
 1.2311672797237834,
 1.668033088658029,
 -0.5162959560131993,
 -0.07943014707895368,
 1.2311672797237834,
 -0.5162959560131993,
 -0.9531617649474451,
 -1.3900275738816907]
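One property worth checking: standardised values should have a mean of approximately 0 and a population standard deviation of approximately 1. The sketch below checks this with the standard library only. `zscores` is a hypothetical stand-alone function using the same formula as `Zscore`; it also guards against a constant array, which has zero standard deviation and so cannot be standardised:

```python
import math
import statistics


def zscores(array):
    """Stand-alone Z-scores for illustration (population standard deviation)."""
    mu = statistics.fmean(array)
    sigma = statistics.pstdev(array)  # population form, divides by N
    if sigma == 0:
        raise ValueError("all values are equal, so Z-scores are undefined")
    return [(x - mu) / sigma for x in array]


z = zscores([3, 3, 2, 6, 2, 2, 4])  # same values as test2

# Mean close to 0 and standard deviation close to 1, up to rounding
print(math.isclose(statistics.fmean(z), 0.0, abs_tol=1e-9))
print(math.isclose(statistics.pstdev(z), 1.0, rel_tol=1e-9))

# A constant array would mean dividing by zero
try:
    zscores([5, 5, 5])
except ValueError as err:
    print("ValueError:", err)
```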

### Part 2 - Testing our results

In this part we will test the results of our `Zscore` function against scikit-learn's `StandardScaler`. We start by initialising a `StandardScaler` and transforming the test case.

In [13]:
st = StandardScaler() ## Create the scaler
test1 = np.array(test1) ## Make the list into an array
test1 = test1.reshape(-1,1) ## Reshape it into a single column (2D)

Ztest1 = st.fit_transform(test1) ## Fit the scaler and transform the data
print(Ztest1) ## Print results

[[-0.95316176]
 [ 0.79430147]
 [-0.51629596]
 [ 1.23116728]
 [ 1.66803309]
 [-0.51629596]
 [-0.07943015]
 [ 1.23116728]
 [-0.51629596]
 [-0.95316176]
 [-1.39002757]]
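The reshape step is there because `StandardScaler` expects a 2D array of shape `(n_samples, n_features)`, so the flat list has to become a single column. Here is a small NumPy-only sketch of what that does to the shape (the names `flat`, `column` and `back` are just for illustration):

```python
import numpy as np

flat = np.array([1, 5, 2, 6, 7, 2, 3, 6, 2, 1, 0])  # same values as test1
column = flat.reshape(-1, 1)  # -1 lets NumPy infer the number of rows

print(flat.shape)    # (11,)
print(column.shape)  # (11, 1)

# ravel() flattens the column back to 1D, e.g. for printing or comparing
back = column.ravel()
print(np.array_equal(back, flat))
```

Keep in mind that cell 13 rebinds `test1` to the column array, so any later cell that uses `test1` sees the 2D version rather than the original list.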


Let's test that it worked:

In [14]:
np.array(Zscore(test1)) == Ztest1

array([[ True],
       [ True],
       [ True],
       [ True],
       [ True],
       [ True],
       [ True],
       [ True],
       [ True],
       [ True],
       [ True]])

Time to test the other ones too! Can we find a nicer way to test whether the results are the same? If the array contains 100 numbers, for example, I don't want to output 100 values of `True`...

In [None]:
## UNFINISHED WORK
## CAN YOU HELP ME??

In [15]:
## Change 3??
np.allclose(Zscore(test1), Ztest1)

True
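To cover the remaining test cases without printing one `True` per element, one option is to loop over them and reduce each comparison to a single boolean with `np.allclose`. This is only a sketch. To keep it runnable without scikit-learn, the reference is computed as `(a - a.mean()) / a.std()`, which is the same population-standard-deviation scaling that `StandardScaler` applies with its default settings. `zscore_list` is a hypothetical stand-in for `Zscore`:

```python
import numpy as np


def zscore_list(array):
    """Plain-Python Z-scores using the population standard deviation."""
    mu = sum(array) / len(array)
    sigma = (sum((x - mu) ** 2 for x in array) / len(array)) ** 0.5
    return [(x - mu) / sigma for x in array]


cases = {
    "test1": [1, 5, 2, 6, 7, 2, 3, 6, 2, 1, 0],
    "test2": [3, 3, 2, 6, 2, 2, 4],
    "test3": [1, 2, 5, 3, 3, 6, 7, 5, 14, 35, 2, 1, 4, 6, 2, 99],
}

results = {}
for name, values in cases.items():
    a = np.array(values, dtype=float)
    reference = (a - a.mean()) / a.std()  # a.std() uses ddof=0 by default
    results[name] = bool(np.allclose(zscore_list(values), reference))
    print(name, results[name])

# One boolean for the whole suite
print(all(results.values()))
```

In a test suite, `np.testing.assert_allclose` performs a similar comparison but raises an `AssertionError` describing the mismatch when the values disagree, which is usually more helpful than a bare `False`.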

### Part 3 - ???

Is there anything else we could improve here? Oh, not the code! Sure.. can always improve that, but..

(I had to go make food after this session, so I won't be doing the busy work of cleaning up. I'll just be pushing this for demonstration purposes to learn Git. Thank you for the workshop as always, ALEX!! ^_^)

**What would make this file even more readable?** Can YOU:

> Tidy it up more! Font, code structure, comments, markdown.. these make the difference between a project people will want to fork and use and one they will be scared to touch.

> Update YOUR README! READMEs are an important part of any repo, so people know what they're looking at!

> Documentation & referencing. This is obviously a trivial case, but we work with lots of models, algorithms & techniques! Make sure to keep a documentation file available as well.

> Cleaning & project structure. We explained our steps quite in depth here for such an easy problem. But, again, your next project likely won't be finding some Z-scores!

It's your turn!!

Step 1. [Fork this repo](https://docs.github.com/en/get-started/quickstart/fork-a-repo)

Step 2. Make all the changes you want to make

Step 3. [Create a pull request](https://docs.github.com/en/desktop/contributing-and-collaborating-using-github-desktop/working-with-your-remote-repository-on-github-or-github-enterprise/creating-an-issue-or-pull-request) to help me update my work and make it better!

Step 4. Love GitHub for the rest of your life! It's one of the most amazing collaborative technologies, and it allows you to build an online portfolio and store all your hard work forever. Get started with the GitHub community directly on [GitHub](https://github.com/community) and/or join us on [reddit](https://www.reddit.com/r/github/)!