# What is Python?

"Python is an interpreted, object-oriented, high-level programming language with dynamic semantics" is the description you get from the Python website. To a beginner, this is all mumbo-jumbo; even I have to look up what each phrase means. Let's break it down:

* Interpreted: Python code is interpreted rather than compiled ahead of time, meaning that another program (called the interpreter) reads your code and converts it into instructions that the computer's processor can understand while the program runs. The advantage is that Python is more user-friendly: you can run code as soon as you write it, without a separate compile step.


* Object-oriented: In addition to simple variables and lists (ordered, one-dimensional collections of items), Python features objects; arrays (multi-dimensional matrices) are available through packages such as NumPy. An object can be thought of as a code container that bundles specific variables and functions; the advantage is that this bundle can be reused many times, with each copy (called an instance) keeping its own values without affecting the others. We will get further into this later.


* High-level: This concept is similar to the first feature: Python syntax is abstracted away from the "machine code" that the computer processor understands. This means that although there is some overhead in compute time, it is much easier to read and write Python than, say, raw binary instructions.


* Dynamic semantics: In a simple example, this means that values can be assigned to variables without explicitly declaring the data type. This feature is not terribly important to beginner coders, but it contrasts with Java and C, where variables need to be declared first with their data type (e.g., int, float). Dynamic vs. static semantics can be a very complicated topic.
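To make the "dynamic semantics" point concrete, here is a minimal sketch (the variable names are made up for illustration). The same name first holds an integer and then a string, and no type is ever declared:

```python
# The same name can refer to values of different types over time;
# no type declaration is needed.
measurement = 42
first_type = type(measurement).__name__

measurement = 'forty-two'
second_type = type(measurement).__name__

print(first_type)   # int
print(second_type)  # str
```
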

### More importantly for you, Python is useful because:

* It's open source and highly customizable: the Python community and user base develop custom code for Python that anyone can install on their computer. This allows you to mix and match different toolboxes/packages to suit your project's analytical needs.


* There are many resources for learning and using basic Python.


* Python typically requires less coding to perform tasks.


* It's object-oriented, which allows for modularization (ability to break code up into easier to understand, smaller components) and reduces redundancy of code.

https://www.python.org/doc/essays/blurb/

### Other basic facts about python that are important to know:

* Lines of code in Python are largely executed sequentially, from top to bottom

* Indexing in Python starts at 0

* You don't need to end each line with a semicolon (in MATLAB, by contrast, lines are commonly ended with one to suppress printed output)

* However, indentation of lines for conditions and loops (we'll talk about these later) is important!
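As a quick preview of the facts above, here is a small sketch (the names are illustrative, and loops are covered properly in a later lesson). It shows that index 0 is the first item and that indentation decides which lines belong to a loop:

```python
word = 'python'
first_letter = word[0]   # index 0 is the first character

count = 0
for letter in word:
    # this indented line belongs to the loop and runs once per letter
    count += 1

# this unindented line runs once, after the loop has finished
print(first_letter, count)  # p 6
```
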


## Jupyter notebooks

In contrast to writing your script in a Python (.py) file, Jupyter notebooks contain cells that you write code into and can execute to display output immediately below the cell. You can have as many cells as you want, and it's good practice to split your code into easy-to-understand chunks using the cells (to challenge yourself, try to keep each cell to fewer than one or two dozen lines of code!). Accordingly, Jupyter notebooks are a fantastic medium for writing and testing code.

__Markdown__: As you probably noticed, there are cells that don't contain code, but rather comments (such as this one you are reading). These are called markdown cells and are useful for adding description or documentation; these execute and display their contents in place.

__Some common Jupyter keyboard shortcuts:__

Click within the cell : allows you to edit code within the cell

Click in the white space margin left of the cell: selects the cell as a whole and allows you to execute keyboard shortcuts detailed below:

`shift-enter` (also works if in text edit mode): this will execute the code within the cell

`a` : insert empty cell above current

`b` : insert empty cell below current

`d, d` (double press): deletes the current cell

`z` : undo previous delete cell

`m` : changes cell to markdown mode

An aside: For people writing code and large programs for deployment to the community, it's common practice to develop and test in Jupyter, but the code will ultimately be transferred to python executable scripts. _For scientists who aren't extensively writing code and are analyzing data for visualization, it is my opinion that Jupyter notebook suffices as the sole medium for writing and executing code_.

In this python primer series, you will learn about the following topics:

1. Strings and file loading

2. Conditionals 

3. Lists and slicing

4. Dictionaries

5. Loops

6. NumPy arrays and slicing

7. Functions

8. Pandas

9. Matplotlib

## Let's start messing around with cells and code

By default, a Jupyter notebook shows only the value of the last line of code in a cell. Two things change this:
* a semicolon at the end of the last line suppresses the automatic last-line output
* `print` calls display their output no matter where they appear in the cell

In [None]:
1+1
1+2
1+3

In [None]:
1+1
1+2
1+3;

In [None]:
print(1+1)
print(1+2)
print(1+3)

#### Integers
- Integers are just 'whole numbers': `1`, `42`, `-10000`.

#### Float
- Floats are numerical values without the restriction to whole numbers: `-3.1415`, `2.71828`, `1.618`

When you start working with large numerical datasets, it's important to note that the numeric type you choose affects memory and storage: for example, in array libraries a 64-bit float takes eight times the space of an 8-bit integer.

In [None]:
42

In [None]:
type(42)

In [None]:
3.1415

In [None]:
type(3.1415)

In [None]:
int(3.1415)
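One detail worth knowing about the last cell: `int()` drops the fractional part rather than rounding. A short sketch of the conversions, with illustrative variable names:

```python
truncated_pos = int(3.99)    # 3: the fraction is dropped, not rounded
truncated_neg = int(-3.99)   # -3: truncation goes toward zero
rounded = round(3.99)        # 4: use round() when you want rounding
as_float = float(42)         # 42.0: converting an int to a float

print(truncated_pos, truncated_neg, rounded, as_float)
```
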

<a id='string'></a>
### String
- A _string_ is a sequence of characters: `'hello world'`, `'subject1'`.
- Define using either single (`'`) or double (`"`) quotes. Be sure to open and close with the same one though.

In [None]:
'single quotes'

In [None]:
"double quotes"

In [None]:
this is not a string

In [None]:
type('hello world')

In [None]:
'42' + 2

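In most cases, it's fine to use single quotes. One specific case where double quotes should be used instead is when the string itself contains single quotes (such as apostrophes). For example, the following cell raises a syntax error because the inner single quotes end the string early: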

In [None]:
'They said 'She's a coding god''
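The cell above fails because Python cannot tell which single quote closes the string. Here is a minimal sketch of two ways to write the same text so that it works (the variable names are only for illustration):

```python
# Option 1: wrap the text in double quotes so the single quotes inside are plain characters
double_outside = "They said 'She's a coding god'"

# Option 2: keep single quotes outside and escape each inner quote with a backslash
escaped = 'They said \'She\'s a coding god\''

print(double_outside)
print(double_outside == escaped)  # True
```
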

### Merging/concatenating strings

In [None]:
'TATATA' + 'ATGCGCG'

In [None]:
'Neuron #{} fired {} spikes'.format(2, 45)

In [None]:
# f-strings (Python 3.6 and later) are a simpler way to do the above
f'Neuron #{2} fired {45} spikes'
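Formatting becomes more useful once the values come from variables instead of being typed in directly. This sketch assumes some made-up values (`neuron_id`, `spike_count`, `duration_s`) and also shows the `:.2f` format specifier, which limits a float to two decimal places:

```python
neuron_id = 2
spike_count = 45
duration_s = 60

message = f'Neuron #{neuron_id} fired {spike_count} spikes'
rate = f'Firing rate: {spike_count / duration_s:.2f} Hz'

print(message)  # Neuron #2 fired 45 spikes
print(rate)     # Firing rate: 0.75 Hz
```
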

## Variables

None of the results up until now have been saved: __Python has no memory of a value once the line of code that produced it has completed__. Variables are universal to coding languages and introduce persistence across the code. The convention for assigning a value to a variable is as follows:

`variable_name` = _expression to be saved_

where `variable_name` is arbitrarily defined by you, the coder, and the expression to the right of the equal sign is a valid python expression.

##### __*Conventions for variables*__ (originally from UNC NBIO 750 course created by Vijay Namboodiri and Randall Ung)

Variable names can be any combination of letters, digits, and `_`, of any length.

What to avoid in names:
- Starting with a number
- Special characters other than `_` (also avoid spaces, periods, and dashes in path or file names; they may cause runtime errors)
- Python keywords

Typically, variables are formatted as words separated by an underscore: `my_new_var` or in camel-case: `myNewVar`

Do NOT use names such as `my.new.var` or `my-new-var`

Check out the <a href='https://www.python.org/dev/peps/pep-0008/'>PEP style guide</a> for more information.
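If you are ever unsure whether a name is allowed, Python can check it for you. The sketch below uses the built-in string method `isidentifier()` and the standard-library `keyword` module; the helper `is_usable_name` and the candidate names are hypothetical, written just for this illustration (functions are covered in a later lesson):

```python
import keyword

candidates = ['my_new_var', 'myNewVar', '2nd_trial', 'my-new-var', 'my.new.var', 'class']

def is_usable_name(name):
    # hypothetical helper: valid identifier syntax AND not a reserved Python keyword
    return name.isidentifier() and not keyword.iskeyword(name)

results = {name: is_usable_name(name) for name in candidates}
for name, ok in results.items():
    print(f'{name!r}: {ok}')
```

Only `my_new_var` and `myNewVar` pass; `class` has valid syntax but is rejected because it is a Python keyword.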

In [1]:
a = 1+1
b = 1*10
my_random_variable = a/4
myrandomvariable = 'hello world'

In [3]:
print(a)
print(b)
print(my_random_variable)
print(myrandomvariable)

2
10
0.5
hello world


In [None]:
print(a, type(a))
print(b, type(b))
print(my_random_variable, type(my_random_variable))
print(myrandomvariable, type(myrandomvariable))

As you can see, variables `a` and `b` are ints (integers), `my_random_variable` is a float (a number with a fractional part), and `myrandomvariable` contains a string (text characters).

You may notice that the output of `type()` also prints "class". This is related to the object-oriented aspect of Python.

Python recognizes `2` and `10` as instances of the int class; a class defines the set of callable functions that every value of that type shares, regardless of the specific value.

For example, the int class supports subtraction and division, which the string class does not.
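Here is a small sketch of what "functions that belong to a class" looks like in practice. The variable names are illustrative; `upper()` and `bit_length()` are real built-in methods of strings and integers, and `hasattr` checks whether a class provides a given operation (`__sub__` is the internal name for subtraction):

```python
number = 10
text = 'hello world'

shouted = text.upper()              # a function that only strings have
bits_needed = number.bit_length()   # a function that only ints have

int_can_subtract = hasattr(int, '__sub__')
str_can_subtract = hasattr(str, '__sub__')

print(shouted)           # HELLO WORLD
print(bits_needed)       # 4
print(int_can_subtract)  # True
print(str_can_subtract)  # False
```
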

In [None]:
# note: variable names cannot start with numbers
2fast2furious = 'family'

Variables can also be overwritten:

In [None]:
print(b)
b = 'Now b is a string'
print(b)

For number variables, we can apply mathematical operations in an elegant pythonic fashion:

In [9]:
x = 3

x += 2 
print(x)

5


This is shorthand for `x = x + 2`. The other arithmetic operators work the same way:

In [None]:
x *= 5
print(x)
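For reference, here is a sketch that chains several of these shorthand operators in one cell, with the running value noted in the comments:

```python
x = 3
x += 2    # same as x = x + 2   -> 5
x *= 5    # same as x = x * 5   -> 25
x -= 1    # same as x = x - 1   -> 24
x /= 4    # same as x = x / 4   -> 6.0 (division always produces a float)
x **= 2   # same as x = x ** 2  -> 36.0

print(x, type(x).__name__)  # 36.0 float
```
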

# Indexing/slicing strings

Elements in a string can be referenced by indexing.
- Indexing starts at 0
- Slicing is done by defining start and end: `my_string[start:end]`
- End index in slicing is __NOT__ included. For example `my_string[0:3]` will only return the 0th, 1st, and 2nd items in a string.

Side note: indexing and slicing will be important later on when we work with other data formats like lists.

In [None]:
print("h e l l o   w o r l d")
print("0 1 2 3 4 5 6 7 8 9 10")
print("")
print(" h   e   l  l  o     w  o  r  l  d")
print("-11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1")

In [None]:
myrandomvariable = 'hello world'

In [None]:
print(myrandomvariable)
print(myrandomvariable[0:5])
print(myrandomvariable[:5]) # same as above: omitting the start index begins at index 0 and stops before index 5

In [None]:
print(myrandomvariable[6:]) # from index 6 onward; omitting the end index runs to the end of the string
print(myrandomvariable[-5:]) # the last 5 characters
print(myrandomvariable[:-5]) # everything except the last 5 characters


In [None]:
print(myrandomvariable[0:11:2]) # start_index:end_index:interval, get every other letter
print(myrandomvariable[::2]) # automatically use first and last index
print(myrandomvariable[1::2]) # start on index 1, get every other letter
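A few more slicing patterns that come up often, sketched with an illustrative variable `greeting`. Note that a slice quietly stops at the end of the string even if the end index is too large, whereas indexing a single position past the end raises an error:

```python
greeting = 'hello world'

length = len(greeting)          # 11 characters, so valid indices are 0 through 10
last_char = greeting[-1]        # 'd': negative indices count from the end
reversed_text = greeting[::-1]  # 'dlrow olleh': a step of -1 walks backwards
past_the_end = greeting[6:100]  # 'world': slices stop at the end without an error

print(length, last_char, reversed_text, past_the_end)
```
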

# Basic error handling

In [None]:
# misspelling a variable name results in a NameError (name is not defined)
my_random_variabl

In [None]:
prin('Prin is not a built-in function')

In [None]:
"2" + 2

In [None]:
myrandomvariable[100]

In [None]:
import notamodule

In [None]:
if 2==2
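Each cell above stops with a different kind of error, and the name of the error is usually the best clue to what went wrong. As a rough sketch (not the only way to do this), `try`/`except` lets code catch an error and keep running; the list `caught` is just an illustrative name, and the example assumes no package called `notamodule` is installed:

```python
greeting = 'hello world'
caught = []

try:
    greeting[100]              # index past the end of the string
except IndexError as err:
    caught.append(type(err).__name__)

try:
    '2' + 2                    # cannot add a string and an integer
except TypeError as err:
    caught.append(type(err).__name__)

try:
    import notamodule          # assumed not to exist on your machine
except ModuleNotFoundError as err:
    caught.append(type(err).__name__)

print(caught)

# one way to fix the TypeError: convert the string to a number first
fixed_sum = int('2') + 2
print(fixed_sum)  # 4
```

The incomplete `if 2==2` line is different: it is a syntax error, which Python reports before any code in the cell runs, so it has to be fixed by correcting the code itself.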

Google is your friend! Stack Overflow is a question-and-answer site where anyone can post coding questions/errors and get knowledgeable answers/solutions. Usually, with an appropriate Google query (pasting the last line of the error message is a good start), you can find your answer on the first page of results.

# Imported packages
But what if we want to run more complex computations or procedures that are outside the scope of our skill or time to code up? Other people have written codebases and published them for the public to use. These open-source collections of code, formatted to be readily installed in our Anaconda environments, are called packages. Packages may seem nebulous at first, but their organization and structure are not terribly complicated: basically a bunch of core Python files with a main Python file that facilitates loading.

There is a whole range of open-source, actively maintained packages, each addressing a common theme. For example, some folks found that code to load and organize tabular data from Excel sheets would be useful, so they developed the package called pandas. The same was true for the ImageJ Python package.

To use these packages, we first have to download and install them into our Python environment by executing the following command in the Anaconda prompt:

`pip install package_name`; for example, we would use `pip install pandas` for the tabular/Excel data package.

The above step downloads and installs the package, but in order to use the package in our own Python/Jupyter notebook code, we have to "import" it, or in other words, load the package's Python scripts. To do this, we add and execute this line at the beginning of our code:

`import pandas`

In short, packages are simply sets of Python code that some individual, group, or entity developed. Ultimately, this code was organized and deployed to a publicly available internet repository. PyPI, the database that pip taps into, is the main publicly available repository. Codebases that are under development can also be found in GitHub repositories.
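Importing works the same way for the modules that ship with Python itself, so you can try it without installing anything. The sketch below uses the built-in `math` module and the standard-library function `importlib.util.find_spec` to check whether a package can be found; the helper `is_installed` is a hypothetical convenience written for this example, and `notamodule` is assumed not to exist on your machine:

```python
import importlib.util
import math
from math import sqrt   # import a single function directly

full_name = math.sqrt(16)   # call the function through the module name
short_name = sqrt(16)       # call the directly imported function

def is_installed(package_name):
    # hypothetical helper: True if Python can locate the package without importing it
    return importlib.util.find_spec(package_name) is not None

print(full_name, short_name)       # 4.0 4.0
print(is_installed('math'))        # True
print(is_installed('notamodule'))  # False, assuming no such package is installed
```
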

In [None]:
import pandas

pandas.__version__

In [None]:
# you can give the package a shorter alias
import pandas as pd

pd.__version__

In [None]:
d = {'dogs': ['sheltie', 'poodle', 'basset', 'chihuahua'], 'cats': ['ragdoll', 'sphynx', 'persian', 'siamese']}

pd.DataFrame(data=d)

In [None]:
df_animal = pd.DataFrame(data=d)

In [None]:
df_animal.head(2)

Keep packages in the back of your mind for now - they will be important for the next lesson and its exercises.