# Introduction to Python
## NYU Health Sciences Library


# Why Python?
Python is a concise programming language that lets you do some impressive things with only a few lines of code! Thanks to Python’s syntax, we can write “clean” code that is easier to read and debug. Further, code written in Python is easy to extend and reuse, allowing you and others to build upon existing code. Python is used in a variety of contexts, from game design to data analysis. It is also widely used in academic research, especially in the sciences, for data collection (e.g., web scraping), data cleaning, data analysis, and visualization.


## Why Python for Scientific Computing?

Since it is relatively easy to use compared to other programming languages, Python has been widely adopted by researchers who don't want to spend a whole lot of time learning computer programming before they can start their actual project. Additionally, over the years an entire ecosystem of helper libraries (or "packages") has been developed to make scientific tasks even easier in Python. 

You may have already heard of some of these libraries, such as [pandas](https://pandas.pydata.org/) (data analysis), [SciPy](https://www.scipy.org/) (science and engineering), [NumPy](https://www.numpy.org/) (multi-dimensional data structures and high-level mathematics), and [BioPython](https://biopython.org/) (biological computing). There is some overlap between these libraries, and typically you will combine features from several of them to accomplish your research tasks.


## Python Environment

If Python is installed on your computer, you can interact with it from the command line (also called the "terminal" or "console") by starting the Python shell, also known as the "REPL" (Read, Eval, Print, Loop). More often, people use a text editor, such as [Sublime](https://www.sublimetext.com/), or a more sophisticated IDE, such as [PyCharm](https://www.jetbrains.com/pycharm/), to write and run code. With so many setups available, you have plenty of options to choose from!

Today, we are using Anaconda and a browser-based Jupyter Notebook, which lets you run code cells selectively and add rich text elements (paragraphs, equations, figures, notes, links) in Markdown. With code, notes, instructions, and comments all in one place, it serves as a powerful resource and learning tool for you!

Try running each of the code blocks below using **`Shift-Enter` (Windows) or `Control-Return`** (Mac). Any text after the # symbol is a comment for humans, and will be ignored by Python. You can also try editing the code blocks and seeing the new results.

In [None]:
3 + 4   # addition

In [None]:
5 * 2   # multiplication

In [None]:
5 / 2 # division

In [None]:
5 ** 2  # exponentiation

In [None]:
True    # boolean: "True" and "False" are reserved keywords in Python, meaning you can't use them yourself as e.g. variable names

In [None]:
not True # negating a boolean

In [None]:
not true # booleans are case-sensitive, so this will throw an error

In [None]:
#Give it a try! 




## Python Syntax 
Python is sometimes loosely referred to as 'executable pseudocode' because it often uses easily recognizable words, which let you infer what the code does. Take, for example, the simple line of code below. What do you think will happen when we run it?


In [None]:
print("Hello World")

Our inputs to Python can roughly be categorized as either commands (i.e., requests to do something) or definitions.

In [None]:
# Two commands in one line: add the numbers, then print the result
print(4 + 7)

In [None]:
# Defining a variable (this produces no output)
my_location = "Health Sciences Library"

In [None]:
# Commanding Python to print the variable you just defined
print(my_location)

## What does this lesson cover? 

* Variables
* Basic Data Types
* Lists 
* Loops
* Dictionaries
* Conditional Statements
* Functions and Arguments
* Python Libraries

# A Note About This Class

Programming is a huge topic, and you can spend an entire lifetime learning and improving. So it's impossible for me to teach you everything you need to know in this one class. Therefore, I don't want you to get too bogged down in the details. It's OK if you leave this class without knowing the proper command for reversing a Python list. You may never have to use that command, and if you ever do, there is no shame in Googling it.

It's much more important that you leave this class knowing how and where to find help when you need it. In fact, learning to search the internet for help is a very important part of programming. No matter what you're trying to do with Python, someone else has already done it before, and has probably posted about it online to help others. You just need to Google the right words to find it!

Basically, the best way to learn Python is to find a problem that interests you, and that you need Python to solve. Once you are working towards a specific goal like this, the problems you face will become much smaller, and you'll know what questions you need to ask to accomplish your goal. For example, instead of your task being "Learn Python" (which can take an entire lifetime) your task will be, "how to get standard deviation of all numbers in a list in Python." That is much easier!
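
As a sketch of how small such a focused task turns out to be, the standard library's `statistics` module can compute it in one call. The readings below are made-up values for illustration, and `stdev()` gives the *sample* standard deviation (`pstdev()` gives the population version).

```python
import statistics

# Made-up glucose readings (mg/dL), for illustration only
glucose_readings = [90, 100, 110]

# stdev() computes the sample standard deviation
spread = statistics.stdev(glucose_readings)
print(spread)   # 10.0
```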

# Storing Data in Variables
A variable is a name assigned to a *value*. Once assigned, the variable can be used anywhere you would use that value.
In Python, the convention is to use descriptive variable names that make clear what your code is doing. Clear names help you maintain your code and help others read and understand it.

In Python, we can assign a value to a variable, using the equals sign ```=```. 

In [None]:
diabetes_test = "A1C test"
print(diabetes_test)

You can think about it as:
> "The variable `diabetes_test` *gets* the value 'A1C test'"

## Example
If we wanted to track the weight of a patient who weighs 60 kilograms, we could do so by assigning the value 60 to a variable `weight_kg`:

In [None]:
weight_kg = 60

In [None]:
patient_id = "001"

In [None]:
weight_lb = 2.2 * weight_kg

In [None]:
weight_lb = 2.2 * weight_kg
print(weight_lb, "lbs.")

In [None]:
patient_id = 'inflam_' + patient_id

The value assigned to a variable will remain the same until you alter it. The value can be changed or reassigned within the program.


Why can we do this? Because Python runs your code one line at a time, so a variable always holds whatever was most recently assigned to it. Be careful: once a variable has been changed, the old value is gone. When in doubt, don't reuse a variable name unless you are sure you no longer need the old value.
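
A minimal sketch of what reassignment looks like in practice, using a made-up variable: Python evaluates the right-hand side first, using the variable's current value, and only then stores the result under the same name.

```python
# Made-up starting value, for illustration only
visit_count = 3
print(visit_count)   # 3

# The right-hand side is evaluated first (3 + 1), then stored back in visit_count
visit_count = visit_count + 1
print(visit_count)   # 4

# The old value (3) is gone unless it was saved under another name first
```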

## Rules for naming variables

- can only contain letters, numbers, and underscores
- can start with a letter or an underscore, but not a number
- no spaces, but an underscore can be used to separate words (snake_case), or you can use CamelCase
- cannot be Python keywords (such as `True` or `if`), and should not reuse the names of built-in functions (such as `print`)
- should be short, but descriptive (`glucose_level` is better than `gl` or `g_l`)
- be consistent: if you start with CamelCase or snake_case, try to use it throughout
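
If you are unsure whether a name follows the hard rules above, Python can check for you. The sketch below combines the string method `isidentifier()` with the standard library's `keyword` module; the helper name `is_valid_name` and the candidate names are made up for illustration.

```python
import keyword

def is_valid_name(name):
    """Return True if name could be used as a variable name (hypothetical helper)."""
    # isidentifier() checks the characters; iskeyword() checks for reserved words
    return name.isidentifier() and not keyword.iskeyword(name)

# Candidate names, made up for illustration
for name in ["glucose_level", "_supply_list", "2nd_sample", "patient id", "class"]:
    print(name, "->", is_valid_name(name))
```

Note that built-in names such as `list` pass this check, because Python allows you to reuse them even though doing so is discouraged.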


# Examples of acceptable variable names

```
Patient_numbers =
_supply_list =
experiment_001_lab =
glucose_level =
```

# Less helpful variable names

```
t_1 =
gl =
_hom_grad =
inventlist =
bknr =
```

# Data Types
You may have noticed that Jupyter Notebook highlights text values in red and integers in green. The colors correspond to the different data types of these values. Python has four basic data types (though many more are available):

*   string : `"Hello there"`
*   boolean (true/false) : `True`
*   integer : `4`
*   decimal (float) : `4.0`

You can check the type of any data using the `type()` function:

In [None]:
print(type(4))
print(type("Hello"))
print(type(3.14))
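
Data types matter because they decide what an operation does. The sketch below, using made-up values, shows that `+` adds numbers but joins strings, and that the built-in functions `int()` and `str()` convert between types.

```python
# Made-up values, for illustration only
weight_text = "60"                  # a string, e.g. as typed into a form
weight_number = int(weight_text)    # convert the string to an integer

print(type(weight_text))     # <class 'str'>
print(type(weight_number))   # <class 'int'>

print(weight_text + weight_text)       # strings are joined: 6060
print(weight_number + weight_number)   # numbers are added: 120

# Numbers must be converted to strings before joining them with text
print("Weight: " + str(weight_number) + " kg")   # Weight: 60 kg
```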

## Exercise

In [None]:
# Create a few helpful variable names relevant to your work
# Assign them numeric values or text values  
# Do a simple calculation and/or concatenate with strings
# Check the data type for each variable 






# Lists 
Lists are mutable, ordered containers of other objects, meaning they can be modified and elements can be added and taken away. The elements in the list maintain their order (until modified). In Python, square brackets (`[]`) indicate a list, and individual elements are separated by commas.

In [80]:
diabetes_tests = ["Fasting plasma glucose (FPG) test", "A1C test", "Random plasma glucose (RPG) test"]

In [81]:
print(len(diabetes_tests))

# Note: Python indexes start at 0, not 1
print(diabetes_tests[0])
print(diabetes_tests[1].title())

# Selecting a range
# Note: the end index "2" means "up to, but not including, position 2"
print(diabetes_tests[0:2])

# Selecting the last value
print(diabetes_tests[-1])

3
Fasting plasma glucose (FPG) test
A1C Test
['Fasting plasma glucose (FPG) test', 'A1C test']
Random plasma glucose (RPG) test


## Add to a list


In [82]:
# Adding using "plus" and a one-item list
diabetes_tests = diabetes_tests + ["Glucose challenge test"]
print(diabetes_tests)
len(diabetes_tests)

['Fasting plasma glucose (FPG) test', 'A1C test', 'Random plasma glucose (RPG) test', 'Glucose challenge test']


4

In [83]:
#Adding using the append method
diabetes_tests.append("Gestational Diabetes")
len(diabetes_tests)

5
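
The two approaches above differ in one way worth knowing: `+` builds a new list and leaves the original alone, while `append()` changes the existing list in place. A small sketch with a made-up list:

```python
# Made-up list, for illustration only
supplies = ["gloves", "swabs"]

# + creates a new list; the original is unchanged
more_supplies = supplies + ["test strips"]
print(supplies)        # ['gloves', 'swabs']
print(more_supplies)   # ['gloves', 'swabs', 'test strips']

# append() modifies the list in place
supplies.append("lancets")
print(supplies)        # ['gloves', 'swabs', 'lancets']
```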

In [None]:
print(diabetes_tests[0:6])
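
The cell above asks for a slice that runs past the end of the list, and Python simply returns the items that are there. Asking for a single position that does not exist is stricter and raises an `IndexError`. A sketch with a made-up list:

```python
# Made-up list, for illustration only
samples = ["A", "B", "C"]

# A slice that runs past the end just stops at the last item
print(samples[0:10])   # ['A', 'B', 'C']

# A single index past the end raises an IndexError
try:
    print(samples[10])
except IndexError as error:
    print("IndexError:", error)   # IndexError: list index out of range
```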

In [None]:
# Insert at a specific position

Data_Services = ["Fred", "Genevieve", "Nicole"] 
Data_Services.insert(0, "Michelle")
print(Data_Services)

## Remove item by index

In [84]:
# Remove using the pop method 

diabetes_tests.pop(1)
len(diabetes_tests)



4
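
Besides removing the item, `pop()` also hands it back to you, which is useful when you still need the value. A sketch with a made-up list:

```python
# Made-up list, for illustration only
equipment = ["centrifuge", "pipette", "microscope"]

# pop(1) removes the item at index 1 and returns it
removed_item = equipment.pop(1)
print(removed_item)   # pipette
print(equipment)      # ['centrifuge', 'microscope']

# With no argument, pop() removes and returns the last item
last_item = equipment.pop()
print(last_item)      # microscope
```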

In [85]:
# Remove by value
# Note: this only removes the first matching value; to remove all matching values you will need to use a loop

diabetes_tests.remove("Gestational Diabetes")
len(diabetes_tests)

3
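
As the note in the cell above says, `remove()` only deletes the first match. One possible way to remove every occurrence, sketched here with a made-up list, is a `while` loop that keeps calling `remove()` as long as the value is still present.

```python
# Made-up list with a repeated value, for illustration only
results = ["positive", "negative", "positive", "negative", "positive"]

# remove() deletes only the first match
results.remove("positive")
print(results)   # ['negative', 'positive', 'negative', 'positive']

# Keep removing until no matches are left
while "positive" in results:
    results.remove("positive")
print(results)   # ['negative', 'negative']
```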

## Testing if Something is In a List 

In [None]:
if 'Glucose challenge test' in diabetes_tests:
    print("Yes, it's in the list!")
else:
    print("Nope, it's not included")
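
One thing to watch for: the `in` test looks for an exact match with a whole item, so capitalization and partial text both matter. The sketch below uses a made-up list, and the lowercase comparison is just one possible workaround.

```python
# Made-up list, for illustration only
tests = ["A1C test", "Glucose challenge test"]

print("A1C test" in tests)   # True
print("a1c test" in tests)   # False: capitalization differs
print("A1C" in tests)        # False: must match a whole item, not part of one

# One possible workaround: compare lowercase versions of every item
lowercase_tests = [test.lower() for test in tests]
print("a1c test" in lowercase_tests)   # True
```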

## Exercise

In [None]:
# Create a list of objects (e.g., names, values, equipment, etc.)
# Use insert() to add something to the beginning of your list
# Use insert() to add something to the middle of your list
# Use append() to add something to the end of your list
# Use pop() to remove an item







# For Loops

# Dictionaries

# Conditional Statements

# Functions and Arguments

# Python Libraries

### Example 1: Plotting the Linear Regression of a Set of Random Numbers

### Example 2: Downloading a CSV into a Pandas DataFrame, and Exploring the Data

# Parting Thoughts


It's important to be able to read official documentation, but it's often not enough to help you understand how to accomplish your goal. For solutions to real-world problems, **look for Stack Overflow questions, tutorials from blogs and/or coding schools, and especially "cookbooks."**

And again: the best way to learn Python is to **find your problem first**, then use Python as the solution. Starting with a goal will help focus your efforts.

### Important science and math libraries
- https://pandas.pydata.org/
- https://www.scipy.org/
- https://www.numpy.org/
- https://biopython.org/
- https://bokeh.pydata.org/en/latest/ (very nice visualizations)
- https://www.tensorflow.org/ (machine learning)

### Python Operators
- https://www.programiz.com/python-programming/operators

### Lists
- https://stackoverflow.com/a/509295
- https://realpython.com/python-lists-tuples/
- https://docs.python.org/3/tutorial/datastructures.html

### Dictionaries
- https://realpython.com/python-dicts/

### Specialized Scientific Data Structures
- https://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html
- https://biopython.org/wiki/Seq
- https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python

### Loops
- https://www.learnpython.org/en/Loops
- https://www.datacamp.com/community/tutorials/for-loops-in-python
- https://realpython.com/python-while-loop/

### Functions
- https://www.datacamp.com/community/tutorials/functions-python-tutorial
- https://realpython.com/python-main-function/ (useful when you're writing a standalone script)

### MatplotLib Tutorial and Gallery
- https://matplotlib.org/3.1.0/tutorials/introductory/pyplot.html
- https://matplotlib.org/3.1.0/gallery/index.html

### Pandas
- https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html
- https://pandas.pydata.org/pandas-docs/stable/user_guide/cookbook.html
- https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python
- https://stackoverflow.com/questions/17071871/select-rows-from-a-dataframe-based-on-values-in-a-column-in-pandas?rq=1
- https://www.datacamp.com/community/tutorials/joining-dataframes-pandas

### More Resources and Online Courses
- https://www.datacamp.com
- https://realpython.com/
- https://docs.python-guide.org/intro/learning/
- https://www.coursera.org/learn/python
- https://www.codecademy.com/learn/learn-python-3

### Easily-Downloadable Datasets to Play With
Note: when downloading datasets from GitHub, be sure to click the "Raw" button near the top of the file page to get the unformatted data, which is what Python needs (the regular page is a web page wrapped around the data). The raw URL (which usually starts with https://raw.githubusercontent.com) can be used directly in the pandas `read_csv()` function.
- https://github.com/fivethirtyeight/data
- https://github.com/mwaskom/seaborn-data
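
As a sketch of the note above: the address of a file's page on GitHub and its "raw" address usually follow a predictable pattern, so one can be turned into the other with a string replacement. The helper name `to_raw_url` and the example file path are assumptions for illustration, not part of any library.

```python
def to_raw_url(github_url):
    """Convert a GitHub file-page URL to its raw-content URL (hypothetical helper)."""
    raw_url = github_url.replace(
        "https://github.com/", "https://raw.githubusercontent.com/", 1
    )
    # File pages include "/blob/" before the branch name; raw URLs do not
    return raw_url.replace("/blob/", "/", 1)

# Example file path, assumed for illustration
page_url = "https://github.com/mwaskom/seaborn-data/blob/master/iris.csv"
print(to_raw_url(page_url))
# https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv
```

The resulting address can then be passed to pandas, for example `pd.read_csv(to_raw_url(page_url))`, assuming pandas is installed and imported as `pd`.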