# Introduction
### Open Source Document

This textbook is specifically for kinesiology graduate students seeking to learn more about quantitative methods. It blends several disciplines into one linear progression: computer science, statistics, anatomy, skateboarding, and climate.

The chapter numbers and brief descriptions are as follows:

  0. Chapter 0: Introduction to Computers
      - Uses Python to introduce how computers work.
      - By the end of the class you should be able to execute code using existing Python libraries.
  1. Chapter 1: CSV Parser
      - Develops a basic version of the CSV parser as seen in the Pandas library.
      - The goal is to learn more about the computer science concepts of time and space complexity.
  2. Chapter 2: Descriptive Statistics
      - Efficiently implements basic descriptive statistics to further an understanding of how computers work.
  3. Chapter 3: Python Graph
      - Creates a black and white Python graph using no libraries.
  4. Chapter 4: Skateboarding and Anatomy
      - Discusses learning basic skateboarding tricks and the anatomy and physiological functions necessary to avoid severe bodily injury.
  5. Chapter 5: GeoServer in the Cloud
      - Tutorial to install GeoServer on Linux and display climate raster time series data on a provider such as Google Cloud Platform.
      - Describing climatology data is challenging. By the time you begin to summarize the information for the layperson, all meaning is binned into loaded terms with preconceived assumptions.
  6. Chapter 6: Linux Automation
      - Describes how to use Bash to install several R and Python libraries to download rasters, transform the data into the GeoTIFF format, and perform calculations.
      - The original data is in a raster format called BIL that represents monthly temperature and precipitation data.
      - These are the first two steps of Extract, Transform, and Load (ETL).
  7. Chapter 7: GDAL Installation
      - Installs GDAL to build the GeoTIFF rasters in XYZ tile format, which will be hosted on the cloud web server and displayed in the next chapters on web development.
      - It is the Load part of ETL.
  8. Chapter 8: GIS Web Development
      - Uses the JavaScript package manager NPM to install AngularJS and OpenLayers to display the rasters.
  9. Chapter 9: Quantiles
      - An example program that uses Python's NumPy to generate quantiles to visualize the information more effectively.
  10. Chapter 10: AngularJS
      - Builds a time slider to view the Nino34 index and the statistical summaries for the raster data more effectively.
  11. Conclusions
      - todo

Graduate-level students in sociology, psychology, music, or any other discipline who already have a firm understanding of basic programming and mathematics would benefit from learning the material. It might also be interesting for engineering, computer science, or statistics undergraduates to see their material applied to something besides the usual textbook example problem: "a train is moving at 50 mph; how long does it take to stop without shoes?"

The Jupyter Notebooks are organized by the number of the week in which the work will be performed. Each week includes a brief title followed by notes, homework assignments, and example code.

## Chapter 0
### Introduction to Computers

0. Introduction

    - The goal of this chapter is to learn basic concepts about computers through programming in Python.

    - The user should be able to read data, write for loops, and save the information. If you can do that already, skip to the next chapter.

    Prerequisites: Typing on a computer. Navigating a web browser. Not clicking on viruses.

1. Reading Data: Part 1 of 2

  **Week 1 Homework: Read data with Pandas**

    Grading Scale:
    0. Did not attempt.
    1. Reads the included CSV into Pandas (/data/P0.csv).
    2. Prints the data.

    **Homework Notes**
    - The Python library 'Pandas' simplifies reading, performing calculations, and writing data.

    - Learning the 'Pandas' library first teaches basic programming concepts that become more complicated without libraries. It will assist in learning the fundamentals for computer science as applied in the next chapters.

    - Tables of information are made up of horizontal rows and vertical columns.

    - The first row or header describes the contents of each vertical column. Example:

    ```
    ID | Date     | Stamina  <---- This is the header row (0)
    0  | 20200410 | 0        <---- This is the first data row (1)
    1  | 20200411 | 3        <---- This is the second data row (2)
    2  | 20200412 | 2        <---- etc.
    3  | 20200413 | 5
    ```

    - Table formats include:
      - Excel Document
      - Comma Separated Values (CSV), Tab Separated Values (TSV)
      - JavaScript Object Notation (JSON)
      - They are all tables
    - JPGs are similar
      - They are fancy tables, or matrices, of numbers
      - Each number represents a pixel value in Red, Green, Blue (RGB)
      - Example model ranging from 0-255:
        - rgb(127, 173, 187) -> kind of greenish/blue
        - Google the term: "RGB Example"
      - Image formats have an encoder/decoder to represent data as colors to the computer
      - A video driver, written in a lower-level programming language, displays the information on a screen
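
The two ideas above can be sketched in a few lines: the same small table written as CSV text and as JSON ends up as the same rows and columns, and an image can be treated as a grid of RGB numbers. This is only an illustration. The sample values, the variable names, and the 2x2 "image" are made up here and are not part of the course data.

```python
import io
import json

import pandas as pd

# The example table written as comma separated values (assumed sample data).
csv_text = "ID,Date,Stamina\n0,20200410,0\n1,20200411,3\n2,20200412,2\n3,20200413,5\n"

# The same table written as JSON: a list of records, one per data row.
json_text = (
    '[{"ID": 0, "Date": 20200410, "Stamina": 0},'
    ' {"ID": 1, "Date": 20200411, "Stamina": 3},'
    ' {"ID": 2, "Date": 20200412, "Stamina": 2},'
    ' {"ID": 3, "Date": 20200413, "Stamina": 5}]'
)

# io.StringIO lets Pandas read text in memory as if it were a file.
table_from_csv = pd.read_csv(io.StringIO(csv_text))
table_from_json = pd.DataFrame(json.loads(json_text))

print(table_from_csv)
print(list(table_from_csv.columns))  # ['ID', 'Date', 'Stamina']
print(table_from_csv.shape)          # (4, 3) -> 4 data rows, 3 columns

# A tiny, hypothetical 2x2 "image": each pixel is a (red, green, blue)
# triple with values from 0 to 255.
image = [
    [(127, 173, 187), (255, 0, 0)],
    [(0, 255, 0), (0, 0, 255)],
]
red, green, blue = image[0][0]  # the pixel in row 0, column 0
print(red, green, blue)         # 127 173 187
```

Real image files store many more pixels and are usually compressed, but the idea of a table of numbers is the same.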

  **More Homework Notes: Python Options:**

    - DO NOT root your phone, especially to install a third-party command-line application.
      - Several applications in the iOS and Google Play stores, or on the internet, let you install a Python development environment.
      - But they expose the lower-level code in your phone that is normally unavailable.
      - Virtual machines and cloud platforms let non-computer scientists learn about computers without falling into a viral hellscape.
        - Viruses are very sophisticated now and can impersonate your friends and family with computer vision and audio mimicking to extract your personal information.
        - Probability theory applied to your image, recordings of your voice, and knowledge of your habits from GPS and browser history can make life very uncomfortable.
        - One example is sending audio from the phone to make it sound like you said something negative about a random person on the street, causing fights or worse.
        - Another is being manipulated into donating your savings to a random account. These aren't operated by humans; they're usually automated programs written by unemployed people.

    - No installation options (Android, iOS, any browser)
      - Google Colab <- the course uses this one, since the course is written using Jupyter Notebooks. Requires a free Gmail account.
      - Enroll in Google Cloud Platform, Amazon Web Services, or another cloud platform. This usually requires a credit card, but they give you a set amount of credits. Then choose the virtual command line or terminal of your choice. I would not use your primary Gmail account for this option.

    - Windows Python
      - Command prompt and text editor (Notepad++)
      - Integrated Development Environment (Spyder, Visual Studio)
      - Jupyter Notebook
    - Mac OS Python
      - Terminal and text editor (Sublime, TextEdit)
      - Jupyter Notebook
    - Linux Python
      - Terminal and text editor (vim, nano, gedit)
      - Jupyter Notebook

  **Week 1 Homework Reminder**

2. Reading Data: Part 2 of 2

  **Week 1 Homework Reminder**

    - Python examples:
      - How to import a library:
        ```
        import pandas
        ```
        or a library referenced as an abbreviation:
        ```
        import pandas as pd
        ```
        or a specific method from a library:
        ```
        from pandas import read_csv
        ```
      - How to set a variable:
        ```
        first_variable = "Hello World"
        ```
      - How to print a variable:
        ```
        first_variable = "Hello World"
        print(first_variable)
        ```
      - How to use a Python method:
        ```
        import pandas as pd
        dataFrame = pd.read_csv('/the/string/path/to/the/file/P0.csv')
        ```
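
As a rough sketch of how the pieces above fit together, the three import styles all reach the same `read_csv` function, and the result can be stored in a variable and printed. The in-memory text below is an assumed stand-in for a CSV file so the example runs anywhere. In the homework you would pass the file path instead.

```python
import io

import pandas
import pandas as pd
from pandas import read_csv

# All three import styles point at the same function.
print(pandas.read_csv is pd.read_csv)  # True
print(pd.read_csv is read_csv)         # True

# Assumed sample data standing in for a CSV file on disk.
fake_file = io.StringIO("ID,Date,Stamina\n0,20200410,0\n1,20200411,3\n")

# Set a variable to the table that read_csv returns, then print it.
data_frame = pd.read_csv(fake_file)
print(data_frame)
print(len(data_frame))  # 2 -> number of data rows, the header is not counted
```

Using the abbreviation `pd` is a common convention, which is why the rest of the examples import Pandas that way.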

  **File Format Notes**

    - MP4 video files are a series of images, similar to flipping through a deck of cards
      - 30 frames per second (FPS) means 30 images shown sequentially each second to give the illusion of motion on the screen
    - As bandwidth and computer memory (RAM) increased, the feasibility of transmitting more information increased
    - In the 1970s and 1980s, the internet was universities and governments transmitting text on kilobit-sized computers the size of a room
    - By the 1990s, home computers had become common with the dot-com bubble, and people started transmitting low-resolution images
    - In the 2000s, increased bandwidth and computer memory allowed video to be exchanged rapidly
    - In the 2010s, the internet replaced TV, similar to how TV replaced radio in the 1960s
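
To get a feel for why video needed more bandwidth and memory than text or single images, here is a back-of-the-envelope calculation. The clip length and resolution are assumptions chosen only for illustration, and the result is for raw, uncompressed frames. Real MP4 files are compressed and much smaller.

```python
frames_per_second = 30     # from the note above
clip_seconds = 10          # assumed clip length
width, height = 640, 480   # assumed resolution in pixels
bytes_per_pixel = 3        # one byte each for red, green, and blue

total_frames = frames_per_second * clip_seconds
bytes_per_frame = width * height * bytes_per_pixel
raw_bytes = total_frames * bytes_per_frame

print(total_frames)           # 300
print(bytes_per_frame)        # 921600
print(raw_bytes)              # 276480000
print(raw_bytes / 1_000_000)  # 276.48 (megabytes)
```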

  **Week 1 Homework Reminder**

3. Data Types

  **Week 1 Homework Due**

  Example:

  ```
  import pandas as pd
  df = pd.read_csv('/content/P0.csv')
  print(df)
  # If you want the first 5 rows
  print(df.head(5))
  ```

  **Week 3 Homework: Data Types**

    Grading Scale:
    0. Did not attempt.
    1. Extract the 4th column 'Stm' from P0.csv and print the data type. Remember that arrays start at 0 in Python.
    2. Extract and change the data type from an 'int64' to a 'str'.

  **Datatype Notes**

  - The Python standard library has several datatypes, which are very similar to those in the extended Pandas library.

      - 'str' = text
        ```
        string_variable = "This is a string."
        ```
      - 'int' = number
        ```
        int_variable = 20
        ```
      - 'float' = decimal number
        ```
        float_variable = 20.1
        ```
      - Arrays are similar to tables and have advantages and disadvantages depending on the types of calculations being performed.
        - Ordered arrays keep their items in a fixed order.
        - Arrays can be changeable (mutable) or unchangeable (immutable).
      - 'list' = ordered, changeable, and allows duplicate values. It is useful for matrix mathematics.
        ```
        one_list = ["This", "is", 1, "List"]
        ```
      - 'tuple' = ordered, unchangeable, and allows duplicate values. Tuples are typically faster than lists.
        ```
        one_tuple = (4, 4, 3)
        ```
      - 'range' = a sequence of whole numbers from a start value up to, but not including, a stop value, for example:
        ```
        range(5)  # 0, 1, 2, 3, 4
        range(1, 6)  # 1, 2, 3, 4, 5
        range(2, 10, 2)  # 2, 4, 6, 8
        ```
      - 'dict' = a dictionary is ordered (from Python 3.7 onwards), changeable, and does not allow duplicate keys. This would be useful in a database with usernames so no one would have the exact same name.
        ```
        dict_variable = {"name" : "Google", "age" : 21}
        ```
      - 'bool' = a datatype that is either True or False. Booleans are used in conditional statements.
        ```
        bool_variable = True
        ```
      - 'NoneType' = the type of the value None. If your program returns None, no data was returned.

      - There are several other datatypes, and an exhaustive list can be found by using a search engine like Google, Bing, DuckDuckGo, etc.

      - The W3Schools website specifically has syntax tutorials and quizzes in several other programming languages besides Python.

      - Otherwise, the official Python documentation is a great source for referencing syntax.
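
The datatypes listed above can be checked with Python's built-in `type()` function. The short sketch below reuses the example variables from the notes and also shows what "changeable" and "unchangeable" mean in practice.

```python
string_variable = "This is a string."
int_variable = 20
float_variable = 20.1
one_list = ["This", "is", 1, "List"]
one_tuple = (4, 4, 3)
dict_variable = {"name": "Google", "age": 21}
bool_variable = True

# Print the name of each value's datatype next to the value itself.
for value in (string_variable, int_variable, float_variable, one_list,
              one_tuple, dict_variable, bool_variable, None, range(5)):
    print(type(value).__name__, value)

# Lists are changeable: an item can be replaced in place.
one_list[2] = "one"
print(one_list)  # ['This', 'is', 'one', 'List']

# Tuples are unchangeable: trying to replace an item raises a TypeError.
try:
    one_tuple[0] = 5
except TypeError as error:
    print("Tuples are unchangeable:", error)

# Dictionary keys are unique: assigning to an existing key replaces its value.
dict_variable["age"] = 22
print(len(dict_variable))  # 2

# A range only produces its numbers when asked, for example with list().
print(list(range(2, 10, 2)))  # [2, 4, 6, 8]
```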

  **Week 3 Homework Reminder**

4.

  **Week 3 Homework Reminder**

  **Python Syntax Notes**

    - How to set a variable to a DataFrame column in Pandas:
      ```
      import pandas as pd
      df = pd.read_csv('/content/P0.csv')
      df_1 = df['Stm']
      print(df_1)
      ```
    - How to print the datatype of each column:
      ```
      print(df.dtypes)
      ```
    - The columns in this DataFrame are all 'int64', a specific type of integer stored using 64 binary bits
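
A column can be selected by its name or by its position, and positions start at 0. The sketch below uses a small made-up table rather than the course's P0.csv, so the column position shown here is an assumption and may differ in the real file. It also prints the range of whole numbers a 64-bit integer can hold.

```python
import io

import pandas as pd

# Assumed sample data with two columns; the real P0.csv may have more.
csv_text = "ID,Stm\n0,4\n1,2\n2,3\n"
df = pd.read_csv(io.StringIO(csv_text))

by_name = df["Stm"]           # select the column by its header name
by_position = df.iloc[:, 1]   # select by position: 0 is the first column, 1 is the second

print(by_name.equals(by_position))  # True
print(by_name.dtype)                # int64

# A signed 64-bit integer uses one bit for the sign, leaving 63 bits for the value.
print(-2**63, 2**63 - 1)  # -9223372036854775808 9223372036854775807
```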

  **Week 3 Homework Reminder**

5.

  **Week 3 Homework Reminder**

  

  **Week 3 Homework Reminder**

6.
  **Week 3 Homework Due**

  Example:
  ```
  import pandas as pd
  df = pd.read_csv('/content/P0.csv')
  # Displays the datatype of each column as int64
  print(df.dtypes)
  # Creates a new variable called 'df_1' holding the 'Stm' column as a 'string' dtype
  df_1 = df['Stm'].astype(pd.StringDtype())
  # Displays the datatype of the new variable
  print(df_1.dtype)
  ```
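
One way to see why the datatype matters is to add a column to itself before and after the conversion. This is a small sketch using made-up values rather than the course's P0.csv: numbers are summed, while text is joined end to end.

```python
import io

import pandas as pd

# Assumed sample data standing in for the 'Stm' column.
df = pd.read_csv(io.StringIO("Stm\n4\n2\n3\n"))

as_numbers = df["Stm"]                            # read in as int64
as_text = df["Stm"].astype(pd.StringDtype())      # converted to text

print((as_numbers + as_numbers).tolist())  # [8, 4, 6]
print((as_text + as_text).tolist())        # ['44', '22', '33']
```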

7. Midterm

8. Break




Unnamed: 0,Stm
0,4
1,2
2,3
3,4
4,4
5,5
6,5
7,5
8,2
9,4
