# Python 0-30 for Scientists
----------------

Welcome to the Python 0 to 30 for Scientists tutorial. In this self-paced course you will learn how to write Python 
code using Python best practices. 

Part 1 is designed to take one work day, but you may move through the content more slowly or more quickly.

Through these instructions you will develop scripts and use git and GitHub to save and organize your work. 

At the end of this tutorial you will have a grasp of how to begin building your own library of Python tools for your 
scientific analysis workflows.

## Why Python?
-------------------

You're already here because you want to learn to use Python for your data analysis and visualizations. Python can be 
compared to other high-level, interpreted, object-oriented languages, but is especially great because it is free and 
open source! 

**High level languages** -

    Other high level languages include MATLAB, IDL, and NCL. The advantage of high level languages is that they provide functions, data structures, and other utilities that are commonly used, which means it takes less code to get real work done. The disadvantage of high level languages is that they tend to obscure the low level aspects of the machine, such as memory use, the number of floating point operations, and other information related to performance. C and C++ are examples of lower level languages. The "higher" the level of a language, the more computing fundamentals are abstracted away.

**Interpreted languages** -

    Most of your work is probably already in interpreted languages if you've ever used IDL, NCL, or MATLAB (interpreted languages are typically also high level). So you are already familiar with their advantages: you don't have to worry about compiling or machine compatibility (your code is portable). And you are probably familiar with their deficiencies: they can be slower than compiled languages and potentially more memory intensive.

**Object-oriented languages** -

    Objects are custom datatypes. For every custom datatype, you usually have a set of operations you might want to conduct. For example, if you have an object that is a list of numbers, you might want to apply a mathematical operation, such as sum, to this list object in bulk. Not every function can be applied to every datatype; it wouldn't make sense to apply a logarithm to a string of letters or to capitalize a list of numbers. Data and the operations applied to them are grouped together into one object.
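
As a small illustration of the idea above, the sketch below uses only Python built-ins and the standard `math` module. The variable names `numbers` and `letters` and their values are made up for this example.

```python
import math

# A list of numbers and a string of letters are two different types of object.
numbers = [1.5, 2.0, 3.5]
letters = "boulder"

# Some operations apply to a whole list of numbers...
total = sum(numbers)
print(total)  # 7.0

# ...while others belong to strings and are attached to the string object.
print(letters.upper())  # BOULDER

# Applying an operation to the wrong type fails with an error instead of
# silently producing nonsense.
try:
    math.log(letters)
except TypeError as err:
    print("TypeError:", err)
```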

**Open source** - 

    Python as a language is open source, which means that there is a community of developers behind its codebase. Anyone can join the developer community and contribute to deciding the future of the language. When someone identifies gaps in Python's abilities, they can write the code to fill these gaps. The open source nature of Python means that the language is very adaptable to the shifting needs of the user community.

Python is a language designed for rapid prototyping and efficient programming. It is easy to write new code quickly 
with less typing.

## Part 1 - First Python Script
----------------------------

This section of the Zero to Thirty tutorial will focus on teaching you Python through the creation of your first script. 

Along the way, you will learn about syntax and the reasoning behind why things are done the way they are.

We will also incorporate lessons on the use of git because we highly recommend version controlling your work.

We are assuming you are familiar with bash and terminal commands. If not, here is a cheat sheet:
[Linux Command Sheet](https://cheatography.com/davechild/cheat-sheets/linux-command-line/)

## Part 1a - Reading a .txt File
--------------------------------------
In building your first Python script we will set up our workspace, read a .txt file, and learn git fundamentals.

Open a terminal to begin.

1. [bash] Create a directory:

   ```bash
   $ mkdir ncar_python_tutorial
   ```

   The first thing we have to do is create a directory to store our work. Let's call it "ncar_python_tutorial."

2. [bash] Go into the directory:

   ```bash
   $ cd ncar_python_tutorial
   ```

3. [conda] Create a virtual environment for this project:

   ```bash
   $ conda create --name ncar_python_tutorial python
   ```

   A **conda environment** is a directory that contains a collection of packages or libraries that you would like installed and accessible for this workflow. To create one, type `conda create --name`, then the name of your project (here "ncar_python_tutorial"), and then `python` to specify that the environment should include Python.

   It is a good idea to create a new environment for each project. Because the Python ecosystem is open source and actively developed, new versions of the tools you use may become available at any time. A dedicated environment helps ensure that your script keeps using the same versions of its packages and libraries, so it runs the way you expect.
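
Once you are writing Python, one way to check which environment a script is running in is to ask the interpreter itself. This is only a sketch using the standard `sys` module. The paths and version it prints depend entirely on your machine, so no expected output is shown.

```python
import sys

# Path of the Python interpreter that is running this code.
print(sys.executable)

# Directory of the environment that this interpreter belongs to.
print(sys.prefix)

# Major and minor version of the Python in use (the numbers will vary).
version = f"{sys.version_info.major}.{sys.version_info.minor}"
print(version)
```

When the code is run from inside a conda environment, `sys.prefix` should point at that environment's directory.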

4. [git] Make the directory a git repository:

   ```bash
   $ git init .
   ```
   
   A **Git repository** tracks changes made to files within your project. It looks like a `.git/` folder inside that project.
   
   This command adds version control to this new ncar_python_tutorial directory and all of its contents.

5. [bash] Create a data directory:

   ```bash
   $ mkdir data
   ```

   And we'll make a directory for our data.

6. [bash] Go into the data directory:

   ```bash
   $ cd data
   ```
   Let's "cd" into the data directory. 

7. [bash] Download sample data:

   ```bash
   $ curl -O https://sundowner.colorado.edu/weather/atoc8/wxobs20170821.txt
   ```

   And download data from the CU Boulder weather station.

   This weather station is a Davis Instruments wireless Vantage Pro2 located on the CU-Boulder east campus at the SEEC building (40.01 N, 105.24 W, 5250 ft elevation). The station is monitored by the Atmospheric and Oceanic Sciences (ATOC) department and is part of the larger University of Colorado ATOC Weather Network.


8. [git] Check the status of your repository:

   ```bash
   $ git status
   ```

   You will see the newly downloaded file listed as an "untracked file." The output of `git status` also tells you what to do with untracked files. Those instructions mirror the next two steps:

9. [git] Add the file to the *git staging area*:

   ```bash
   $ git add wxobs20170821.txt
   ```
   
   By adding this data file to your directory, you have made a change that is not yet reflected in your git repository. Type "git add" and then the name of the new or altered file to stage your change.

10. [git] Check your git status once again:

   ```bash
   $ git status
   ```

   Now this file is listed as a "change to be committed," i.e. staged.
   Staged changes can now be *committed* to your repository history.

11. [git] Commit the file to the *git repository*:

   ```bash
   $ git commit -m "Adding sample data file"
   ```
   
   And then with "git commit", update your repository with all the changes you staged, in this case just one file. 

12. [git] Look at the git logs:

   ```bash
   $ git log
   ```
   
   Typing "git log" shows a log of all the commits, or changes, made to your repository.

13. [bash] Go back to the top-level directory:

   ```bash
   $ cd ..
   ```

14. [bash] Create a blank Python script:

   ```bash
   $ touch mysci.py
   ```

   And now that you've set up your workspace, create a blank Python script called "mysci.py".

15. [python] Edit the `mysci.py` file using nano, vim, or your 
   favorite text editor:

   ```python
   print("Hello, world!")
   ```

   Your classic first command will be to print "Hello, world!".

16. [python] Try testing the script:

   ```bash
   $ python mysci.py
   ```
   
   And test that the script works by typing "python" and then the name of your script. 
   
   **Yay!** You've just created your first Python script. 
   

17. [python] You probably won't need to run your Hello World script again, so delete the `print("Hello, world!")` line and start over with something more useful - we'll read the first 4 lines from our datafile.

   Change the `mysci.py` script to read:
   
   ```python
   # Read the data file
   filename = "data/wxobs20170821.txt"
   datafile = open(filename, 'r')

   print(datafile.readline())
   print(datafile.readline())
   print(datafile.readline())
   print(datafile.readline())

   datafile.close()
   ```

   And test your script again by typing:

   ```bash
   $ python mysci.py
   ```
 
   First create a variable for your datafile name, which is a **string** - this can be in single or double quotes.

   Then create a variable associated with the opened file; here it is called `datafile`.
   
   The `'r'` argument in the `open` command indicates that we are opening the file for reading. Other arguments for `open` include `'w'`, which you would use if you wanted to write to the file.

   The `readline` command moves through the open file, always reading the next line.

   And remember to `close` your datafile.

   **Comments** in Python are indicated with a hash, as you can see in the first line `# Read the data file`. Comments are ignored by the interpreter.

   Run `python mysci.py` every time you want to execute the script
   and test your changes. From here on, this will no longer be
   listed as a separate step between every change to our script.
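
To see what `readline` hands back, here is a minimal sketch that writes a tiny stand-in file with `'w'` and reads it back with `'r'`. The temporary file and its two lines of text are assumptions made up for this example, not part of the weather data.

```python
import os
import tempfile

# Assumption: a tiny stand-in file with made-up contents, written to a
# temporary directory so the real data file is left untouched.
tmpdir = tempfile.mkdtemp()
filename = os.path.join(tmpdir, "example.txt")

# 'w' opens the file for writing (and creates it if it does not exist).
outfile = open(filename, 'w')
outfile.write("first line\nsecond line\n")
outfile.close()

# 'r' opens the file for reading; each readline() call returns the next line.
datafile = open(filename, 'r')
first = datafile.readline()
second = datafile.readline()
third = datafile.readline()
datafile.close()

# repr() makes the newline character at the end of each line visible.
print(repr(first))   # 'first line\n'
print(repr(second))  # 'second line\n'
print(repr(third))   # '' (an empty string: there are no more lines)

# Remove the temporary file and directory again.
os.remove(filename)
os.rmdir(tmpdir)
```

Each line returned by `readline` still ends with a newline character, and `print` adds one of its own. That is why the output of `mysci.py` shows a blank line after every line of data.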

18. [python] Change the `mysci.py` script to read your whole data file:

   ```python
   # Read the data file
   filename = "data/wxobs20170821.txt"
   datafile = open(filename, 'r')
   data = datafile.read()
   datafile.close()

   # DEBUG
   print(data)
   print('data')
   ```

   Our code is similar to before, but now we've read the entire file. To test that this worked, we'll `print(data)`. The `print` function in Python requires parentheses around the object you wish to print; here it is `data`.
   
   Try `print('data')` as well. Now Python will print the string `'data'`, as it did for the "Hello, world!" string, instead of the information stored in the variable `data`.

   Don't forget to execute with `python mysci.py`.
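
The difference between a variable and a string literal, and the fact that `read` returns the whole file as a single string, can be sketched as follows. The temporary file and its contents are assumptions standing in for the weather data.

```python
import os
import tempfile

# Assumption: a small temporary file with made-up contents stands in for
# the weather data file.
tmpdir = tempfile.mkdtemp()
filename = os.path.join(tmpdir, "example.txt")
outfile = open(filename, 'w')
outfile.write("first line\nsecond line\n")
outfile.close()

datafile = open(filename, 'r')
data = datafile.read()   # the whole file as one string
rest = datafile.read()   # nothing is left to read, so this is empty
datafile.close()

print(data)        # prints the contents stored in the variable
print('data')      # prints the four letters d-a-t-a
print(len(data))   # 23 characters, newlines included
print(repr(rest))  # ''

# Remove the temporary file and directory again.
os.remove(filename)
os.rmdir(tmpdir)
```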

19. [python] Change the `mysci.py` script to read your whole data file using a context manager `with`:

   ```python
   # Read the data file
   filename = "data/wxobs20170821.txt"
   with open(filename, 'r') as datafile:
       data = datafile.read()
   
   # DEBUG
   print(data)
   ```
   
   Again this is a similar method of opening the data file, but we now use `with open`. The `with` statement works with a context manager (here, the file object returned by `open`) that handles clean-up and ensures that the file is automatically closed when the indented block ends.
   
   The indentation of the line `data = datafile.read()` is very important. Python is sensitive to white space, and your script will fail if its indentation mixes tabs and spaces inconsistently (Python does not know your tab width). It is best practice to use spaces rather than tabs, since tab width is not consistent between editors.
   
   Combined, these two lines mean: with the data file opened, I'd like to read it.

   And execute with `python mysci.py`.
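
One way to convince yourself that `with` really closes the file is to check the file object's `closed` attribute inside and outside the indented block. This is a sketch only. The temporary directory and file contents are assumptions for illustration.

```python
import os
import tempfile

# Assumption: a temporary file with made-up contents stands in for the
# weather data file.
with tempfile.TemporaryDirectory() as tmpdir:
    filename = os.path.join(tmpdir, "example.txt")
    with open(filename, 'w') as outfile:
        outfile.write("first line\nsecond line\n")

    with open(filename, 'r') as datafile:
        # Inside the indented block the file is open.
        data = datafile.read()
        print(datafile.closed)  # False

    # Back at the outer indentation level the file has been closed for us.
    print(datafile.closed)  # True

# The string we read is still available after the file is closed.
print(data)
```

The indentation is what tells Python which lines belong to the `with` block, so the file is closed as soon as the code returns to the outer indentation level.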

20. [python] What did we just see?  What is the `data` object?  What type is `data`?  How do we find out?  
   
   Add the following to the `DEBUG` section of our script:
   
   ```python
   print(type(data))
   ```
   And execute with `python mysci.py`

   In the `DEBUG` section of our script, `type(data)` reports the type of our data object. Object types include `float`, `int`, `str`, and other types that you can create yourself.

   Python is a dynamically typed language, which means you don't have to explicitly specify the datatype when you name a variable; Python will automatically figure it out from the nature of the data.
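
The same `type` check works for any object. The short sketch below uses made-up variable names and values to show a few built-in types, and that a name is not tied to one type.

```python
# Hypothetical values, chosen only to illustrate different built-in types.
count = 4
temperature = 21.5
station = "SEEC"
readings = [21.5, 22.0]

print(type(count))        # <class 'int'>
print(type(temperature))  # <class 'float'>
print(type(station))      # <class 'str'>
print(type(readings))     # <class 'list'>

# A name is not tied to one type: assigning a new value changes what
# type() reports.
count = "four"
print(type(count))        # <class 'str'>
```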

21. [git] Now, clean up the script by removing the `DEBUG` section, before we commit this to git.

22. [git] Let's check the status of our git repository:

   ```bash
   $ git status
   ```
   
   Note what files have been changed in the repository.

23. [git] Stage these changes:

   ```bash
   $ git add mysci.py
   ```

24. [git] Let's check the status of our git repository again. What's different from the last time we checked the status?

   ```bash
   $ git status
   ```

25. [git] Commit these changes:

   ```bash
   $ git commit -m "Adding script file"
   ```
  
   Here, a good commit message (given with `-m`) for our changes would be "Adding script file".

26. [git] Let's check the status of our git repository now. It should tell you that there are no changes made to your repository (i.e., your repository is up-to-date with the state of the code in your directory).

   ```bash
   $ git status
   ```

27. [git] Look at the git logs, again:

   ```bash
   $ git log
   ```
   You can also print simplified logs with the `--oneline` option.

-------------
That concludes the first lesson of this virtual tutorial. 

In this section you set up a workspace by creating your directory, conda environment, and git repository. You 
downloaded a .txt file and read it using the Python commands `open()`, `readline()`, `read()`, `close()`, and `print()`, as well as the `with` statement. You should be familiar with the `str` datatype. You also used fundamental git commands such as `git init`, `git status`, `git add`, `git commit`, and `git log`.

Please continue to [Part 1b](z230_p1b.ipynb).