<a href="https://colab.research.google.com/github/broadwell/broadwell.github.io/blob/master/intro_to_python_filled.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Python

## Info
- Peter Broadwell (CIDR), broadwell@stanford.edu
- Scott Bailey (CIDR), scottbailey@stanford.edu

## Goal

By the end of our workshop today, we hope you'll understand basic syntax in Python for variables, functions, and control flow, and understand some of the basic data structures in Python. With these in hand, you'll know enough to write basic scripts and explore other features of the language.  

## Topics
- Variables and types/structures (String, Int, Float, List, Dictionary)
- Functions
- Control flow
- Reading and writing text to a file 
- A basic workflow of reading in content, doing something to it, then writing a new output

## Setup

Go to http://bit.ly/cidr-intro-python-19

To access the link, you will need to be signed in to [Google Drive](https://drive.google.com) with your SUNet account. If access is denied, check to make sure you aren't signed in to a different Google account by clicking the icon at the top right of the browser window.

This is the **intro_to_python_filled.ipynb** notebook, which gives the expected output of each command in the tutorial, as well as solutions to the coding activities. We recommend opening the other, unfilled notebook and using it during the tutorial, although you can also refer back to this notebook.

Once you have selected a notebook, you should click the "Open with Colaboratory" link that appears at the top of the page. You'll then want to make sure that you have copied the notebook to your own Drive account. This can be accomplished by clicking the "Open in Playground" button and then the "Copy to Drive" button, if available, or else by selecting a similar option under the "File" menu. This copy of the notebook is now attached to your own user account, so you can edit it in any way you like -- you can even take notes directly in the notebook.

## Why Python?

It's multi-use: you can write simple scripts to automate tasks, write complex code for machine learning and other approaches, and even build full-scale web applications.

The biggest reason we see people learning Python right now is for data science and related approaches, regardless of disciplinary background.

## Jupyter Notebooks and Cloud Services

Jupyter notebooks are a way to write and run Python code interactively. They've quickly become a standard tool for putting together data, code, and written explanation or visualizations into a single document that can be shared. There are many ways to run Jupyter notebooks, including locally on your own computer, but for this workshop we're using Google's "Colaboratory" platform. It is similar to services from Microsoft and Amazon, all of which provide the same core functionality of running Jupyter notebooks in the cloud. Google "Colab" offers a virtual environment that can contain a notebook and static files and comes with a variety of popular libraries pre-installed. It also allows you to install other libraries as needed, and even gives you access to cloud-based GPUs for deep-learning applications. Be aware, however, that files won't persist across sessions unless you download them or save them to Drive.

Using the Colab platform allows us to focus on learning and writing Python in the workshop rather than on setting up Python, which sometimes can take a bit of extra work depending on operating systems and other aspects of the computing environment. If you'd like to install a Python distribution locally, though, we have some instructions (with gifs!) on installing Python through the Anaconda distribution, which will also help you handle virtual environments: https://github.com/sul-cidr/python_workshops/blob/master/setup.ipynb

If you run into problems, or would like to look at other ways of installing Python, feel free to send us an email. 

## Variables and types

In [0]:
# Strings
greeting = "Hello, I'm Peter. It's a pleasure to meet you."
# After you run this cell, note the difference between printing out in Jupyter and getting the
# output from the last line of the cell
print(greeting)
greeting

Hello, I'm Peter. It's a pleasure to meet you.


"Hello, I'm Peter. It's a pleasure to meet you."

In [0]:
# Find a letter by index
greeting[3]

'l'

In [0]:
# Get the length of a string. Here, len() is a built-in function in Python
len(greeting)

46

In [0]:
len?

In [0]:
# Count spaces in the string. Here, count() is a method that all strings have, i.e.,
# a function that can be run on the string.
greeting.count(' ')

8

In [0]:
greeting.count?

In [0]:
# Slice to get the first 3 characters
greeting[:3]

'Hel'

In [0]:
# Get the last three characters
greeting[-3:]

'ou.'

In [0]:
# Replace hello with goodbye
greeting.replace("Hello", "Goodbye")

"Goodbye, I'm Peter. It's a pleasure to meet you."

In [0]:
# String concatenation
"Hello" + " " + "World"

'Hello World'
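
One detail worth spelling out about the two cells above: strings in Python are immutable, so `replace()` and `+` each produce a new string rather than changing the original. The short sketch below illustrates this; the variable name `farewell` is only an illustration and is not used elsewhere in this notebook.

```python
greeting = "Hello, I'm Peter. It's a pleasure to meet you."

# replace() returns a new string; greeting itself is left untouched
farewell = greeting.replace("Hello", "Goodbye")

print(greeting)   # still starts with "Hello"
print(farewell)   # starts with "Goodbye"

# To keep the changed text, assign the result to a variable (as done with farewell)
```

If you want the change to "stick", assign the result back to a variable, for example `greeting = greeting.replace("Hello", "Goodbye")`.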

In [0]:
# Use tab in Jupyter notebooks to explore functionality
# greeting.

In [0]:
# Numbers
# Integer and floats
first_num = 10
second_num = 5.467
print(type(first_num), type(second_num))

<class 'int'> <class 'float'>


In [0]:
# Addition
1 + 5

6

In [0]:
# Division
10 / 2

5.0

In [0]:
# Multiplication
5 * 2

10
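
The division cell above returns `5.0` rather than `5` because the `/` operator always produces a float in Python 3. As a small sketch of the related operators (the variable names here are illustrative only):

```python
# "True" division always returns a float, even when the result is a whole number
true_division = 10 / 2

# Floor division (//) discards the fractional part
floor_division = 10 // 3

# The modulo operator (%) gives the remainder
remainder = 10 % 3

print(true_division, floor_division, remainder)
print(type(true_division), type(floor_division))
```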

In [0]:
# Lists
drinks = ['coffee', 'tea', 'water']
drinks

['coffee', 'tea', 'water']

In [0]:
# Python allows you to create lists containing elements of different types
mixed = [2, 'hello', 10.5, 'here is a sentence']
for item in mixed:
    print(type(item))

<class 'int'>
<class 'str'>
<class 'float'>
<class 'str'>


In [0]:
# Get item by index
drinks[2]

'water'

In [0]:
# Add an item to the end of the list
drinks.append('juice')
# Modify an element "in place" via its index
drinks[0] = "hot water"
drinks

['hot water', 'tea', 'water', 'juice']
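
Lists support the same indexing and slicing notation shown for strings earlier, and `len()` works on them too. A brief sketch using a fresh copy of the `drinks` list (the other variable names are illustrative):

```python
drinks = ['coffee', 'tea', 'water']

first_two = drinks[:2]     # slice: a new list with the first two items
last_item = drinks[-1]     # negative indexes count from the end

drinks.append('juice')     # append() changes the list itself
how_many = len(drinks)     # number of items now in the list

print(first_two, last_item, how_many)
```

Unlike strings, lists can be changed in place, which is why `append()` does not need its result assigned to a variable.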

In [0]:
# Splitting a string - note the type of the output
greeting_words = greeting.split(' ')
greeting_words

['Hello,', "I'm", 'Peter.', "It's", 'a', 'pleasure', 'to', 'meet', 'you.']

In [0]:
# Joining a list of strings 
' '.join(greeting_words)

"Hello, I'm Peter. It's a pleasure to meet you."

In [0]:
# Dictionaries are another useful data type. They associate "keys" (usually strings or integers)
# to values, which can be anything: ints, strings, floats, lists, other dictionaries
majors = { 'Peter': 'Musicology', 'Scott': 'Religious Studies', 'Quinn': 'Slavic Studies', 'Ron': 'Economics' }
majors['Peter']

'Musicology'
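
Looking up a key that is not in the dictionary with square brackets raises a `KeyError`. The sketch below shows two common ways of working with dictionaries safely: `get()` with a fallback value, and looping over key/value pairs with `items()`. The fallback string `'unknown'` and the variable names `missing` and `lines` are assumptions made for illustration.

```python
majors = { 'Peter': 'Musicology', 'Scott': 'Religious Studies', 'Quinn': 'Slavic Studies', 'Ron': 'Economics' }

# get() returns the fallback value instead of raising an error when the key is absent
missing = majors.get('Claudia', 'unknown')
print(missing)

# items() yields each key together with its value
lines = []
for name, major in majors.items():
    lines.append(name + ': ' + major)

print(lines)
```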

There are plenty of other data types and structures that we aren't going to use today, such as sets and tuples.

## Functions

At the most basic level, functions are chunks of reusable code.

In [0]:
# Define a function
def add(num1, num2):
    return num1 + num2

add(1, 2)

3

In [0]:
# Functions can take lists as arguments and return a new list
def combine_arrays(array1, array2):
    new_list = array1 + array2
    return new_list

first = ['hello', 2]
second = [1, 10]
new = combine_arrays(first, second)
new

['hello', 2, 1, 10]

### Activity

Pig latin is a language game where you take the first letter of a word, move it to the end of the word, then add '-ay' at the end. For example, 'pig latin' would be 'igpay atinlay' and 'python' would turn into 'ythonpay'.

In the cell below, write a function that takes a string, lowercases it, and returns the pig latin translation of the word. You'll need to use slicing and string concatenation to make this work.

In [0]:
def pig_latinize(word):
    word = word.lower()
    word1 = word[1:] + word[:1] + 'ay' 
    return word1

# the following should return 'ellohay'
pig_latinize('Hello')

'ellohay'
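
Note that `pig_latinize` treats its whole argument as a single word, so a phrase like the 'pig latin' example in the activity description has to be split into words first. Here is a minimal sketch of one way to do that with `split()` and `join()`; `pig_latinize_phrase` is a hypothetical helper name, not part of the original activity.

```python
def pig_latinize(word):
    word = word.lower()
    return word[1:] + word[:1] + 'ay'

def pig_latinize_phrase(phrase):
    # Hypothetical helper: latinize each whitespace-separated word, then rejoin them
    latin_words = []
    for word in phrase.split():
        latin_words.append(pig_latinize(word))
    return ' '.join(latin_words)

print(pig_latinize('pig latin'))         # the space is treated as part of one long "word"
print(pig_latinize_phrase('pig latin'))  # each word is handled separately
```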

## Control flow 

In [0]:
# IF (conditional execution)
name = "Bob"

if name == "Peter":
    print("Hi Peter!")
else:
    print("Who are you?")

Who are you?


In [0]:
# You can use control flow with functions
# Also, you can use if, elif (short for "else if"), and else to specify more than one condition
name = "Bob"

def say_hello(name):
    return "Hello, " + name + "!"

if (name == "Bob"):
    message = say_hello("Bob")
    print(message)
elif (name == "Peter"):
    message = say_hello("Peter")
    print(message)
else:
    print("Who are you?")

Hello, Bob!


In [0]:
# FOR loops let you iterate over a list or other iterable object
names = ["Vijoy", "Claudia", "Scott", "Peter"]
for name in names:
    print(name, len(name))

Vijoy 5
Claudia 7
Scott 5
Peter 5


In [0]:
# You can combine types of control flow
for name in names[:3]:
    if len(name) > 5:
        print(name)

Claudia


In [0]:
# Testing list membership
if ('Peter' in names):
    print("Peter is here")
else:
    print("Peter is missing!")

Peter is here


In [0]:
# Build a new list by applying a function to each item of an existing list
def add_one(num):
    return num + 1

nums = [1, 2, 3, 4]
plus = []
for num in nums:
    plus.append(add_one(num))
# Alternative to the loop above, using the built-in map() function:
#plus = list(map(add_one, nums))
plus

[2, 3, 4, 5]
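
The cell above includes a commented-out line that uses the built-in `map()` function. As a quick sketch, the following compares the two approaches side by side; the names `plus_loop` and `plus_map` are illustrative.

```python
def add_one(num):
    return num + 1

nums = [1, 2, 3, 4]

# Explicit loop: append each result to a new list
plus_loop = []
for num in nums:
    plus_loop.append(add_one(num))

# map() applies add_one to every item; list() collects the results into a list
plus_map = list(map(add_one, nums))

print(plus_loop == plus_map)
```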

In [0]:
# ADVANCED: List Comprehensions
# List comprehensions are a "pythonic" way of building lists in a compact manner

added = [add_one(num) for num in nums]
added

[2, 3, 4, 5]

In [0]:
long_names = [name.lower() for name in names[:3] if len(name) > 5]
long_names

['claudia']

In [0]:
# ADVANCED: List Comprehension with dictionary value lookup
studiers = [ key for key in majors if majors[key].find("Studies") >= 0]
studiers

['Scott', 'Quinn']
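
The `in` operator used earlier to test list membership also works for substrings, so the `find(...) >= 0` test can be written more directly. A small sketch comparing the two forms (the variable names `with_find` and `with_in` are illustrative):

```python
majors = { 'Peter': 'Musicology', 'Scott': 'Religious Studies', 'Quinn': 'Slavic Studies', 'Ron': 'Economics' }

# find() returns the position of the substring, or -1 if it is absent
with_find = [key for key in majors if majors[key].find("Studies") >= 0]

# "in" asks the same yes/no question directly
with_in = [key for key in majors if "Studies" in majors[key]]

print(with_find)
print(with_in)
```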

### Activity

In the cell below, write a function that loops over a list and returns a new list where all the strings have been replaced with their pig latin translations. 

For example, if your list is `['hello', 5, 'world']` your output should be `['ellohay', 5, 'orldway']`.

Feel free to reuse the pig latinizer you wrote above. You'll also need to think about checking the type of each item in the list.

In [0]:
def pig_latinize_list(items):
    latinized_items = []
    for item in items:
        if (type(item) == str):
            latinized_items.append(pig_latinize(item))
        else:
            latinized_items.append(item)
    return latinized_items

pig_latinize_list(['hello', 5, 'world'])

['ellohay', 5, 'orldway']
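
The solution above compares `type(item)` to `str` directly. A common alternative, sketched here, is the built-in `isinstance()` function, which also accepts subclasses of `str`. The function name `pig_latinize_list_isinstance` is hypothetical, chosen only to distinguish it from the solution above.

```python
def pig_latinize(word):
    word = word.lower()
    return word[1:] + word[:1] + 'ay'

def pig_latinize_list_isinstance(items):
    latinized_items = []
    for item in items:
        if isinstance(item, str):
            # Only strings are translated
            latinized_items.append(pig_latinize(item))
        else:
            # Everything else is passed through unchanged
            latinized_items.append(item)
    return latinized_items

print(pig_latinize_list_isinstance(['hello', 5, 'world']))
```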

## Reading from and writing text to a file 

In [0]:
# Working with comma-separated and similar data files will be covered in a later workshop. 
# It's worthwhile, however, to see how to read and write data or text to a file.
# We'll start with writing some text to a file, then explore how to read it.

sample_text = """
Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
Fusce pharetra tristique iaculis. Morbi maximus interdum nibh, at faucibus lacus porta vitae.
Praesent mi velit, tempus sit amet sagittis a, sodales sit amet sapien.
Praesent dictum, diam a hendrerit cursus, eros dolor posuere sem, a porttitor libero nulla molestie eros.
"""
with open('lorem.txt', 'w') as f:
    f.write(sample_text)


# You can check the "Files" tab in the column at left now to find the output file
# Note that you must click the "REFRESH" button to see it.


In [0]:
# Now, let's read the file back.
with open('lorem.txt', 'r') as f:
    print(f.read())


Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
Fusce pharetra tristique iaculis. Morbi maximus interdum nibh, at faucibus lacus porta vitae.
Praesent mi velit, tempus sit amet sagittis a, sodales sit amet sapien.
Praesent dictum, diam a hendrerit cursus, eros dolor posuere sem, a porttitor libero nulla molestie eros.



In [0]:
# We can also read the file line by line
with open('lorem.txt', 'r') as f:
    for line in f:
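        # Each line still ends with a newline, and print() adds another one;
        # uncomment the next line to strip the extra newline
        #line = line.strip()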
        print(line)



Lorem ipsum dolor sit amet, consectetur adipiscing elit. 

Fusce pharetra tristique iaculis. Morbi maximus interdum nibh, at faucibus lacus porta vitae.

Praesent mi velit, tempus sit amet sagittis a, sodales sit amet sapien.

Praesent dictum, diam a hendrerit cursus, eros dolor posuere sem, a porttitor libero nulla molestie eros.
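
The blank lines between the printed lines above appear because each line read from the file still ends with a newline character, and `print()` adds one of its own. The sketch below makes this visible on a tiny two-line file. It assumes a hypothetical file name, `lorem_demo.txt`, written to the system's temporary folder so that it doesn't overwrite anything in your workspace.

```python
import os
import tempfile

# Hypothetical demo file, placed in the system's temporary folder
demo_path = os.path.join(tempfile.gettempdir(), 'lorem_demo.txt')

with open(demo_path, 'w') as f:
    f.write("first line\nsecond line\n")

raw_lines = []
clean_lines = []
with open(demo_path, 'r') as f:
    for line in f:
        raw_lines.append(line)            # still ends with '\n'
        clean_lines.append(line.strip())  # surrounding whitespace, including '\n', removed

print(raw_lines)
print(clean_lines)
```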



### Collaborative Activity

There is a file named sonnet.txt, containing the text of Shakespeare's Sonnet 18, in the [Drive folder](https://drive.google.com/open?id=1T0CHayyfEXSsspeF854FIAz4W00iJlpv) you accessed when obtaining a local copy of this notebook. Our goal is to read that file in, pig latinize the sonnet, and write a new file containing the pig latinized sonnet.

Some work is needed to get the file into the notebook's local (temporary) storage. You'll need to open the link above in a new browser tab, right-click the "sonnet.txt" file to download it, then click the "UPLOAD" button in the "Files" list in the pane to the left of this notebook to open a dialog that will let you select and then upload the file.

Hint: you may want to remove whitespace (including the newline character) from the beginning and end of each line with .strip()

In [0]:
# Pig latinize the sonnet
with open('sonnet.txt', 'r') as f:
    with open('latin_sonnet.txt', 'w') as g:
        lines = [line.strip() for line in f]
        for line in lines:
            words = line.split()
            latin_words = [pig_latinize(word) for word in words]
            latin_sent = ' '.join(latin_words)
            g.write(latin_sent + '\n')
# Notice when you look at this new file that the punctuation remained **within** the latinized words
# since we didn't account for it

Click the "REFRESH" button to update the "Files" list at left to display any new files you've created.
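
The comment in the cell above notes that punctuation stays inside the latinized words. One possible way to handle it, sketched here under the assumption that only punctuation at the end of a word matters, is to set the trailing punctuation aside, latinize the rest, and then put the punctuation back. The helper name `pig_latinize_keep_punctuation` is hypothetical, and this is not the only reasonable approach.

```python
import string

def pig_latinize(word):
    word = word.lower()
    return word[1:] + word[:1] + 'ay'

def pig_latinize_keep_punctuation(word):
    # Hypothetical helper: separate any trailing punctuation from the word itself
    core = word.rstrip(string.punctuation)
    trailing = word[len(core):]
    if core == '':
        # Nothing but punctuation: return it unchanged
        return word
    return pig_latinize(core) + trailing

print(pig_latinize('day?'))                   # punctuation ends up inside the word
print(pig_latinize_keep_punctuation('day?'))  # punctuation stays at the end
```

Punctuation inside a word, such as an apostrophe, is left where it is by this sketch.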

## Further resources and topics

### Resources
- https://python.swaroopch.com/ (A Byte of Python is a great intro book and reference for Python)
- https://docs.python.org/3/ (Official Python documentation and tutorials)
- https://realpython.com/ (Contains a lot of different tutorials at different levels)
- https://www.lynda.com/Python-training-tutorials/415-0.html (Lynda is free with Stanford accounts. I haven't used these tutorials but have used Lynda for other programming languages and been quite happy with it)

### Topics
- Other data structures: sets, tuples
- Libraries, packages, and pip
- Virtual environments
- Text editors and local execution environments
- The object-oriented paradigm in Python: classes, methods