# Introduction to Python Workshop Series 1: Getting Started
March 9 at 3:00 PM \
Vincent Scalfani and Lance Simpson \
*The University of Alabama Libraries* \
[Contact Information on UA Libraries Directory](https://www.lib.ua.edu/#/staffdir?liaison=1&department=Rodgers%20Library%20for%20Science%20and%20Engineering)


**Today, attendees will learn how to:**

* Work with Python in Google Colaboratory
* Format notebooks
* Get help in Python and view documentation
* Understand basic Python syntax and create variables
* Import Python libraries  
* Load data into Colaboratory notebooks
* Save and share notebooks


# Announcements

March 23 at 3:00 PM \
**Introduction to Python Workshop Series 2: Functions, Loops, and Conditional Statements**

April 6 at 3:00 PM \
**Introduction to Python Workshop Series 3: DataFrames and Plotting Data**

More information and registration here: https://calendar.ua.edu/department/university_libraries/calendar




# What is Google Colaboratory (Colab) [1]?

1. An online, Jupyter-based environment for Python programming.

2. It is free.

3. No setup or local installations are necessary. Code is run on a temporary virtual machine provided by the Colab service.

4. You can use Chrome, Firefox, or Safari. Other browsers may be supported.

5. A "Pro" version is available if more computing resources are needed.




[1] See the Google Research FAQ: https://research.google.com/colaboratory/faq.html


# Setup For Today

If you would like to follow along interactively with us today:

1. Go to the link provided in Zoom chat for this Colab notebook. 

2. Save a copy to your Google Drive. You should then be able to run and edit the code interactively.

# Getting Started
For good luck and to make sure our Colab Notebook connects to a virtual machine:

In [104]:
print("Hello World!")

Hello World!


## Working with Text in Colab

### In Code Cells

In [105]:
# This is a Code cell
# Python will ignore any text after a `#` symbol
# Text comments can be on their own line like this:

# Let's say Hi!
print("Hello World!")

Hello World!


In [106]:
# Text comments can also be placed after code on same line:
print("Hello World!") # Let's say Hi!

Hello World!


### In Text Cells [2,3]

This is a Text Cell. Text Cells use Markdown formatting, which is a markup language [2,3]. The basics are straightforward:

```markdown
**bold text**
```
**bold text**

\

```markdown
*Italics*
```
*Italics*

\

Headings use `#` and automatically populate the Colab Table of Contents.

```markdown
# Heading 1
## Heading 2 (under 1)
### Heading 3 (under 2)
```

# Heading 1
## Heading 2 (under 1)
### Heading 3 (under 2)

\

List markup:

```markdown
1. Item one
2. Item two
3. Item three
```
1. Item one
2. Item two
3. Item three

Bullets:

```markdown
* Item one
* Item two
* Item three
```
* Item one
* Item two
* Item three

\


Links can be added as standard URLs:

https://github.com/vfscalfani/UALIB_Workshops

Or they can be renamed like this:

```markdown
[UALIB GitHub Repository](https://github.com/vfscalfani/UALIB_Workshops)
```

[UALIB GitHub Repository](https://github.com/vfscalfani/UALIB_Workshops)

\

Code blocks can be incorporated with three backticks.


````
```python
print("Hello World!")
```
````

```python
print("Hello World!")
```

A single backtick,  `` ` ``, is used for inline code:

The Python function \`print`.

The Python function `print`.

\

**References**

[2] https://colab.research.google.com/notebooks/markdown_guide.ipynb

[3] https://docs.github.com/en/github/writing-on-github/basic-writing-and-formatting-syntax



## Interrupt Code Execution [4]

Before we get too far, it is useful to know how to stop code execution in Colab. Consider this simple example where we print the numbers 1-99 with a 1-second delay between each print. What if we made a mistake or the code is taking too long?

In [None]:
# Stop a code execution with:
# Select Runtime > Interrupt Execution or click on stop.
import time
nums = list(range(1,100))
for num in nums:
  print(num)
  time.sleep(1)

[4] https://colab.research.google.com/notebooks/basic_features_overview.ipynb

# Getting Help in Python

## Web Documentation

We recommend starting out with viewing the online web-based documentation for Python: https://docs.python.org/3/. See also the Library Reference section for built-in Python functions: https://docs.python.org/3/library/index.html

\
Moreover, when using new libraries, start with the available online documentation. For example, here is the documentation for the pandas data analysis and manipulation library: https://pandas.pydata.org/docs/

## help() function [5,6]



In [108]:
# If you know the name of the function, use the help() function 
# to display the docstring
help(sorted)

Help on built-in function sorted in module builtins:

sorted(iterable, /, *, key=None, reverse=False)
    Return a new list containing all items from the iterable in ascending order.
    
    A custom key function can be supplied to customize the sort order, and the
    reverse flag can be set to request the result in descending order.



In [109]:
# using a question mark will show the help in a pop-up Colab window
sorted?

In [110]:
# You may have noticed that Colab shows recommendations as you type.
# These can be useful, especially for exploring operations of objects.

a = [6,3,0,5] # this creates a list

In [111]:
# now if we type `a` followed by a period `.`, the available attributes/methods
# will show up. We can select one we are interested in and add a ? for info
a.pop?

For more help information, see:

[5] https://jakevdp.github.io/PythonDataScienceHandbook/01.01-help-and-documentation.html

[6] https://colab.research.google.com/notebooks/basic_features_overview.ipynb

## dir() function [7]

In [None]:
# The Python dir() function is useful for exploring modules [7].
# this prints a list of available functions and variables
# example with time module
import time
dir(time)

[7] https://stackoverflow.com/questions/139180/how-to-list-all-functions-in-a-python-module

Note that if you are unfamiliar with the module, it may be easier to explore the module on the web: https://docs.python.org/3/library/time.html?highlight=time#module-time

In [114]:
# now we can use help() to get more information about a particular function
help(time.sleep)

Help on built-in function sleep in module time:

sleep(...)
    sleep(seconds)
    
    Delay execution for a given number of seconds.  The argument may be
    a floating point number for subsecond precision.
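
As a small follow-up sketch (the variable name `public_names` is ours and not part of the `time` module), the list returned by `dir()` can be filtered so that only the public names remain, which is often easier to scan:

```python
import time

# dir() returns a plain list of strings, so it can be filtered like any list.
# Names that begin with an underscore are conventionally internal.
public_names = [name for name in dir(time) if not name.startswith("_")]
print(public_names)

# The text that help() displays is stored in the function's docstring.
print(time.sleep.__doc__)
```

The exact names printed depend on the Python version and operating system, so no expected output is shown here.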



# Basic Python Syntax [8]

In [115]:
# assignments
h = 5

In [116]:
# You can continue statements on a new line by enclosing with parentheses
k = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9
     + 10 + 11 + 12 + 13 + 14 + 15)

In [117]:
# Parentheses are also used for grouping
5 * (1 + 1)

10

In [118]:
# note that an assignment displays no output by default
# use the print function or type the variable name to see the value
k

120

In [119]:
# each statement is generally written on a new line
a = 1
b = 2

# alternatively, you can terminate statements with a semicolon
# and put multiple statements on one line, like this:

a = 1; b = 2

In [120]:
# Indentation is important

x = 1
if x < 2:
   y = x + 1
   print(x) # here print(x) will only be executed if x < 2

1


In [121]:
x = 20
if x < 2:
   y = x + 1
print(x) # here print(x) is outside the block and will always be executed.

20


In [122]:
# note that spaces on the same line do not matter
x = 1 + 10
x

11

In [123]:
x = 1       +     10
x

11

In [124]:
# functions are called with parentheses
print("Hello World!")

Hello World!


In [125]:
# empty parentheses are used when a function should be evaluated
# even with no argument input

myList = [23, 1, 45, 9]
myList.reverse() # the () indicates to evaluate reverse function with no arguments
print(myList)

[9, 45, 1, 23]
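
To make the role of the empty parentheses more concrete, here is a minimal sketch (the names `my_list`, `method`, and `result` are illustrative). Without parentheses, Python only refers to the method; with parentheses, the method actually runs:

```python
my_list = [23, 1, 45, 9]

# No parentheses: this only refers to the method, nothing is reversed yet.
method = my_list.reverse
print(callable(method))   # True
print(my_list)            # [23, 1, 45, 9]

# With parentheses: the method runs and reverses the list in place.
# reverse() itself returns None, so there is nothing useful to store.
result = my_list.reverse()
print(result)             # None
print(my_list)            # [9, 45, 1, 23]
```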


[8] Basic Python Syntax section adapted from https://nbviewer.jupyter.org/github/jakevdp/WhirlwindTourOfPython/blob/master/02-Basic-Python-Syntax.ipynb

# Python Variables [9,10]

Python uses the `=` symbol to assign values to names.

## Simple Variables


### Integers


In [126]:
x = 5
print(x)

5


In [127]:
# use the type function to view type
type(x)

int

Variables persist between cells:

In [128]:
print(x)

5


In [129]:
# overwrite x
x = 7
print(x)

7


### Floating Point Numbers

In [130]:
a = 1.825
print(a)

1.825


In [131]:
type(a)

float

In [132]:
# we can also use the float function
b = float(2)
print(b)

2.0


In [133]:
type(b)

float
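
The following sketch shows, under our own choice of example values, how integers and floats interact. The variable names `whole` and `part` are illustrative only:

```python
whole = 7   # an int
part = 2    # another int

print(whole / part)        # true division always gives a float: 3.5
print(whole // part)       # floor division of two ints gives an int: 3
print(int(3.9))            # int() drops the fractional part: 3
print(type(whole + 1.0))   # mixing int and float gives a float
```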

### Strings

Python uses single `' '` or double quotes `" "` to create strings

In [134]:
s1 = "Thanks for coming to our workshop!"
print(s1)

Thanks for coming to our workshop!


In [135]:
s2 = 'single quotes work too'
s2

'single quotes work too'

In [136]:
type(s2)

str

In [137]:
# double quotes are convenient when the string itself contains an apostrophe:
s3 = "Wow, that's awesome"
print(s3)

Wow, that's awesome


We can index into strings using brackets

In [138]:
# get the first character (zero based indexing in Python)
s3[0]

'W'

In [139]:
# get the last character by its index (the string has 19 characters, so the last index is 18)
s3[18]

'e'

In [140]:
# get the last character, use -1 for end
s3[-1]

'e'

In [143]:
# get a range of characters using a :
# the start index is included, the stop index is excluded
s3[0:3]

'Wow'

In [None]:
# There are many built in functions and methods for strings
help(str)

In [145]:
# length
len(s3)

19

In [146]:
# create a title version
s1.title()

'Thanks For Coming To Our Workshop!'

In [147]:
# lowercase all text
s1.lower()

'thanks for coming to our workshop!'

In [148]:
# replace text
s1.replace('workshop','zoom lesson',1)

'Thanks for coming to our zoom lesson!'
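
One detail worth seeing in a short sketch: string methods such as `replace` return a new string and leave the original untouched. The name `s4` below is our own addition for illustration:

```python
s1 = "Thanks for coming to our workshop!"

# replace() returns a new string; s1 itself is not modified.
s4 = s1.replace("workshop", "zoom lesson", 1)
print(s1)        # Thanks for coming to our workshop!
print(s4)        # Thanks for coming to our zoom lesson!

# Negative indices also work in slices: the last nine characters.
print(s1[-9:])   # workshop!
```

To keep a changed string, assign the result to a name, as done with `s4` here.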

There are other built-in Python types including complex, boolean, and None. See the Python documentation for [Built-in Types](https://docs.python.org/3/library/stdtypes.html).
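
As a brief illustration of these additional types (the variable names are hypothetical and chosen only for this sketch):

```python
z = 2 + 3j         # complex number; j marks the imaginary part
flag = 5 > 3       # comparisons produce a boolean
nothing = None     # None represents the absence of a value

print(type(z))        # <class 'complex'>
print(z.real)         # 2.0
print(flag)           # True
print(type(nothing))  # <class 'NoneType'>
```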

## Compound Variables

### Lists

Lists are one of the sequence types in Python and are created by separating values with a comma and enclosing in brackets, `[]`.

In [149]:
# create a list with numbers
myL = [10, 20, 20, 30, 100, 10, 50]
print(myL)

[10, 20, 20, 30, 100, 10, 50]


In [150]:
type(myL)

list

In [151]:
# create a list with strings
myS = ["apple", "orange", "pear", "blueberry"]
print(myS)

['apple', 'orange', 'pear', 'blueberry']


There are many built-in sequence operations described in the [Python Documentation for Sequence Types](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range). Let's try a few:

In [152]:
# check if an item is in a list
5 in myL

False

In [153]:
"pear" in myS

True

In [154]:
# get length of list
len(myL)

7

In [155]:
# get min of list
min(myL)

10

In [156]:
# count occurrences of a value
myL.count(20)

2

In [157]:
# concatenate another list (this creates a new list; myL is unchanged)
t = myL + [70, 60]
print(t)

[10, 20, 20, 30, 100, 10, 50, 70, 60]


In [158]:
# append a value
myL.append(80)
myL

[10, 20, 20, 30, 100, 10, 50, 80]

In [159]:
# sort a list
myL.sort()
print(myL)

[10, 10, 20, 20, 30, 50, 80, 100]


In [160]:
myS.sort()
print(myS)

['apple', 'blueberry', 'orange', 'pear']


In [161]:
# reverse a list
myL.reverse()
print(myL)

[100, 80, 50, 30, 20, 20, 10, 10]
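
The `sort` and `reverse` methods change the list in place. As a sketch of the alternative, the built-in `sorted` function seen earlier with `help(sorted)` returns a new list and leaves the original alone. The names `my_l` and `new_list` are illustrative:

```python
my_l = [10, 20, 20, 30, 100, 10, 50]

# sorted() builds a new list; reverse=True requests descending order.
new_list = sorted(my_l, reverse=True)
print(new_list)   # [100, 50, 30, 20, 20, 10, 10]
print(my_l)       # [10, 20, 20, 30, 100, 10, 50]
```

This can be handy when the original order still matters later in the notebook.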


### Lists can contain a mix of types

In [162]:
# you can mix types of data and even have lists within lists!
R = [20, 1.825, 'a string', [37,11,22,41]]
print(R)

[20, 1.825, 'a string', [37, 11, 22, 41]]


### List Indexing

Python indexing starts at 0 from left to right. When indexing from right to left, the indexing starts at -1.

In [163]:
myL = [10, 20, 20, 30, 100, 10, 50]
print(myL)

[10, 20, 20, 30, 100, 10, 50]


In [164]:
# get the 1st value
myL[0]

10

In [165]:
# get the 5th value
myL[4]

100

In [166]:
# get the last value
myL[-1]

50

In [167]:
# get the second to last value
myL[-2]

10

In [168]:
# get the first four values with a slice
myL[0:4] # the stop index 4 is excluded, so this returns indices 0 through 3

[10, 20, 20, 30]

In [169]:
# replace a value
myL[1] = 99
print(myL)

[10, 99, 20, 30, 100, 10, 50]


In [170]:
# replace values
myL[4:7] = [1000, 1001, 1002]
print(myL)

[10, 99, 20, 30, 1000, 1001, 1002]


In [171]:
# to index lists within lists, add a second set of brackets
# for example, get the i character within 'a string'

R = [20, 1.825, 'a string', [37,11,22,41]]
R[2][5]

'i'

In [172]:
# index out last two elements of [37,11,22,41]
R[3][2:4]

[22, 41]
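
A few more slice forms, shown here as a sketch on the same example values (the name `my_l` is ours), may help when reading other people's code. Either index can be left out, and an optional third number sets the step:

```python
my_l = [10, 20, 20, 30, 100, 10, 50]

print(my_l[:4])    # start omitted, same as my_l[0:4]: [10, 20, 20, 30]
print(my_l[4:])    # stop omitted, runs to the end: [100, 10, 50]
print(my_l[-3:])   # the last three values: [100, 10, 50]
print(my_l[::2])   # every second value: [10, 20, 100, 50]
```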

Python has a variety of other compound data structure types such as tuple, dictionary, set, and more [10]. See the [Python documentation](https://docs.python.org/3/library/index.html).
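
For a first impression of these other types, here is a minimal sketch. The variable names and values are made up for illustration and are not used elsewhere in this workshop:

```python
# A tuple is like a list, but it cannot be changed after creation.
point = (3, 4)
print(point[0])                   # 3

# A dictionary maps keys to values.
alkane_weights = {"Methane": 16.043, "Ethane": 30.07}
print(alkane_weights["Ethane"])   # 30.07

# A set keeps only unique values (and has no guaranteed order).
unique_values = set([10, 20, 20, 30, 100, 10, 50])
print(sorted(unique_values))      # [10, 20, 30, 50, 100]
```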


Python Variables sections adapted from:

[9] http://swcarpentry.github.io/python-novice-gapminder/

[10] https://github.com/jakevdp/WhirlwindTourOfPython (see chapters 6 and 7)

# Importing Python Libraries [11]



What is a Python Library [11]?


* Python Libraries are collections of files (modules) containing functions and variables.

* Colab has many Python libraries that can be imported directly such as [NumPy](https://numpy.org/) and [pandas](https://pandas.pydata.org/), which are often used with data analysis.

* External libraries can be found on the [Python Package Index (PyPI)](https://pypi.org/).

Python [Built-in functions](https://docs.python.org/3/library/functions.html) are always available and generally do not need to be explicitly imported. However, there are modules within the [Python Standard Library](https://docs.python.org/3/library/index.html) (e.g., math functions, file format handling, etc.), where we need to import the library module before we can use it.

We can use `import` to load a library module:

In [None]:
import math
# check the help documentation
help(math)

Let's try some of the math functionality. We can refer to functions and other things in the module by using a period, `.`, followed by the function or constant name.

In [175]:
# return absolute value
math.fabs(-1.2)

1.2

In [176]:
# return square root
math.sqrt(9)

3.0

In [180]:
# If we leave off math. (and sqrt has not been imported directly),
# Python will report a NameError
sqrt(9)

NameError: name 'sqrt' is not defined

We can import specific parts from the module using `from`. This then allows us to refer directly to, for example, a specific function:

In [181]:
from math import sqrt
sqrt(9)

3.0

In [182]:
# We can also import more than one item at a time
from math import sqrt, fabs
c = sqrt(9)
d = fabs(-1.2)
print('c is', c)
print('d is', d)

c is 3.0
d is 1.2


An alias can be created for library modules.

In [183]:
import math as m
# now instead of math., we use the alias, m.
m.sqrt(9)

3.0

In [184]:
# you will almost always see certain libraries (e.g., numpy) 
# imported as an alias
import numpy as np

Importing specific parts of a module and creating aliases can help shorten programs and (sometimes) improve code readability.
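
The same dot notation and import forms also work for constants. The sketch below uses `math.pi`; the alias `square_root` and the variable names are our own and only serve the example:

```python
import math
from math import pi, sqrt as square_root  # an alias for a single function

radius = 2
area = math.pi * radius ** 2   # constants are accessed with a period too
print(round(area, 2))          # 12.57

print(pi == math.pi)           # True, both names refer to the same value
print(square_root(16))         # 4.0
```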


## Libraries not in Colab

If you find that a library you need is not available directly in Colab, see the documentation for [Importing a library that is not in Colaboratory](https://colab.research.google.com/notebooks/snippets/importing_libraries.ipynb). You can use `pip` or `apt-get` called from shell (`!`).

[11] This section on libraries was adapted from http://swcarpentry.github.io/python-novice-gapminder/06-libraries/index.html

# Loading Data into Colab

We will review an introductory method, where we can load files from our local system into the Colab virtual machine using the file navigation window. Other more advanced options such as connecting to your Google Drive are described in the Colab Documentation on [External data](https://colab.research.google.com/notebooks/io.ipynb).


**Important** 

All data uploaded to Colab is temporary and deleted when the virtual machine is deleted (~12 hours or less). [See the Colaboratory FAQ](https://research.google.com/colaboratory/faq.html).




**Basic File Navigation**

When working with data files in Colab, it is useful to know a few Unix shell file navigation operations [12]. Shell commands are executed in Colab/Jupyter by placing the `!` symbol before the command, like `!pwd` to print the working directory. Jupyter/Colab also has system aliases for many common operations, so you may not need to use `!` before the operation [13,14].






**Print the working directory:**

In [None]:
pwd

**List the items in the content folder:**

In [None]:
ls

Click on the File Navigation Window on the left to see the content folder. We can probably upload data to `/content` or `sample_data`, but it might be nice to create our own folder for organization. We can use `mkdir` to create a new folder directory. Let's call it `workshop_test`.

In [187]:
mkdir workshop_test

**List the items in content folder to see our new folder**

In [None]:
ls
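
The same steps can also be done from Python itself with the standard library `os` module. This is only a sketch: it works inside a temporary folder (an assumption made so the example cleans up after itself) rather than in Colab's `/content` folder, and the names `base`, `new_folder`, and `contents` are illustrative:

```python
import os
import tempfile

# Work inside a throwaway folder that is removed when the block ends.
with tempfile.TemporaryDirectory() as base:
    new_folder = os.path.join(base, "workshop_test")
    os.mkdir(new_folder)          # like: mkdir workshop_test

    print(os.getcwd())            # like: pwd
    contents = os.listdir(base)   # like: ls
    print(contents)               # ['workshop_test']
```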

**Uploading Files**

Next, we can use the file navigation window to upload data directly to our new folder. We'll upload a sample file called alkanes.txt, which is available in the [UALIB_Workshops Python data folder](https://github.com/vfscalfani/UALIB_Workshops/tree/master/03_Python/data). To download it, click on the alkanes.txt file, then right click on `Raw > Save Link as`. Then, in the Colab file navigation window, click on the workshop_test folder `three dots > upload`.

**Change directories into our new folder:**

In [None]:
cd workshop_test

**List contents to see our file:**

In [None]:
ls

**View the alkanes.txt file content with unix `cat`:**

In [191]:
cat alkanes.txt

Methane 	CH4 	16.043
Ethane 	C2H6 	30.07
Propane 	C3H8 	44.1
Butane 	C4H10 	58.12
Pentane 	C5H12 	72.15
Hexane 	C6H14 	86.18
Heptane 	C7H16 	100.2
Octane 	C8H18 	114.23
Nonane 	C9H20 	128.25
Decane 	C10H22 	142.28


Now if we want to use this data within Python, we can read the data in with the [Python csv module](https://docs.python.org/3/library/csv.html) or use a library like pandas to read the data into a DataFrame (we'll look at this more in Workshop 3). Preview below:

In [192]:
import numpy as np
import pandas as pd
alkane_df = pd.read_csv('alkanes.txt', names=['alkane_name', 'mol_formula',
                                              'mol_weight'], sep = '\t' )
alkane_df

| | alkane_name | mol_formula | mol_weight |
|---|---|---|---|
| 0 | Methane | CH4 | 16.043 |
| 1 | Ethane | C2H6 | 30.07 |
| 2 | Propane | C3H8 | 44.1 |
| 3 | Butane | C4H10 | 58.12 |
| 4 | Pentane | C5H12 | 72.15 |
| 5 | Hexane | C6H14 | 86.18 |
| 6 | Heptane | C7H16 | 100.2 |
| 7 | Octane | C8H18 | 114.23 |
| 8 | Nonane | C9H20 | 128.25 |
| 9 | Decane | C10H22 | 142.28 |
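
For comparison, here is a minimal sketch of the `csv` module route mentioned above. To keep it self-contained, it reads three lines copied from alkanes.txt out of an in-memory string (an assumption made for this example; with the uploaded file you would open `alkanes.txt` instead). The names `sample` and `rows` are illustrative:

```python
import csv
import io

# Three tab-separated lines in the same layout as alkanes.txt.
sample = "Methane \tCH4 \t16.043\nEthane \tC2H6 \t30.07\nPropane \tC3H8 \t44.1\n"

rows = []
# delimiter="\t" tells the reader that columns are separated by tabs.
for name, formula, weight in csv.reader(io.StringIO(sample), delimiter="\t"):
    # strip() removes the trailing space before each tab; float() converts the weight.
    rows.append((name.strip(), formula.strip(), float(weight)))

print(rows[0])     # ('Methane', 'CH4', 16.043)
print(len(rows))   # 3
```

Unlike pandas, the `csv` module returns every field as a string, so numeric columns need an explicit conversion as shown.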


[12] See, for example, http://swcarpentry.github.io/shell-novice/

[13] https://colab.research.google.com/notebooks/basic_features_overview.ipynb

[14] https://jakevdp.github.io/PythonDataScienceHandbook/01.05-ipython-and-shell-commands.html#Shell-Related-Magic-Commands

# Save and Share Notebooks

1. You can share notebooks as you would with other Google services by using the 'Share' button, which offers a variety of permissions options.

2. You can download Colab notebooks in .ipynb and/or .py format to upload to cloud services (e.g., UA Box) or code repository sites like GitHub. There is also a direct option to save a copy to [GitHub or Google Drive](https://colab.research.google.com/github/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb).

> Tip: Consider backing up the .py file too as this can be easily viewed in any text editor.

# Python Learning Resources

We recommend the following resources as a start for further reading. Some content in this workshop (as referenced and attributed above) has been adapted and derived from them:


[1] https://github.com/jakevdp/WhirlwindTourOfPython

[CC0-1.0 License](https://github.com/jakevdp/WhirlwindTourOfPython/blob/master/LICENSE)

\

[2] http://swcarpentry.github.io/python-novice-gapminder/

[CC-BY-4.0 License](http://swcarpentry.github.io/python-novice-gapminder/LICENSE.html)

\

In addition, UA Libraries provides access to many Python eBooks. Use [Scout](https://www.lib.ua.edu/scout/) to discover Python eBooks. Start with a search for `python` and limit the results to eBooks within the computer science discipline.











# Questions

That's it for today, thank you!

# Notebook Copy

An archived version of this notebook with (most) outputs is available on our UALIB_Workshops GitHub repository: https://github.com/vfscalfani/UALIB_Workshops