# Part 1:  Programming Concepts

# Installation

We're going to install several things that will give you a standard Python development environment, with applications on top of it that are suitable for beginners.  This means you're set for exploration and experimentation, and you'll also have a setup that remains appropriate if you continue on.

## Step 1: Download and install this stuff

* Download and install the Python 3.6 graphical installer of Anaconda from here: https://www.continuum.io/downloads
* Download and install the PyCharm Education Edition:  https://www.jetbrains.com/pycharm-edu/download

The Anaconda package will install Python and other helper packages.  It'll also put some hooks/aliases in place on your computer to make working with Python easier (particularly for Windows users).  You will not directly interact with Anaconda.  You will use PyCharm to write and execute your scripts, with PyCharm hooked to your Anaconda installation.  We'll walk you through this.

## Step 2: Test Anaconda

After Anaconda has finished installing, restart your computer.  Now we’ll test that everything worked.  You’ll be using Anaconda for the entire class, but it’ll be somewhat invisible to you.

**Open your command line**

* Windows users:  open up your Command Prompt application (in your system search bar, type in ‘cmd’ and an application will open up).  YouTube has videos explaining this in more detail if you need it.

* Mac users:  open your Terminal application. Either search via Spotlight for ‘terminal’ or open it from Applications -> Utilities -> Terminal.  

**Attempt to launch Python**

Within your command line application, type in ‘python’ (without quotes) and press return.  
* Windows users:  you should not get an error that it doesn’t know what Python is.
* Mac users:  it should state it is Python 3.6 and not Python 2.7.

Put up your help flag and halt progress here if you hit either error condition.  Otherwise, just close the window after this.  Everything has worked and you can move on to installing PyCharm Education edition.

**Open PyCharm Education edition**

* When you open it for the first time it’ll ask you what sort of project you want.  Select Create New Project.
* On the next screen you’ll want to change a few things:
    1.	Select the Location where you want your project folder to exist on your computer.  I suggest putting it in your Documents folder.
    2.	Change Untitled to Humanities Data (or some other name of choice).
    3.	For interpreter, this is where it gets weird.  
        * Windows users, the drop down should show something like C:\Anaconda\python.exe, which you should select.  
        * For Mac users, you might see several.  You want the one that has anaconda in the file path, e.g. ~/anaconda/bin/python.  Choose the one that has Python 3.6.x.  _Do not choose one that states Python 2.6 or 2.7_. You may need to click the … button and manually add where your ~/anaconda/bin/python exists.  

**Make a Python script in PyCharm and try to run it**

Don’t worry; you don’t need to know Python yet. We’re just testing to see if everything is working.

1. Right click on the name of your project on the left.  Select New -> Python file
2. Name it something like ‘testing’
3. You’ll now see an empty file in the main part of the PyCharm window
4. Type in the following into that screen: `print("Hello world")`
5. A green "play" arrow should appear next to line 1, beside that text.  It may take a moment or two for the green button to appear.  That is completely normal; it is just PyCharm connecting to your Anaconda installation.
6. Click that green button!
7. A new panel at the bottom of the program should appear, with text that starts with "Hello world". "Process finished with exit code 0" means that everything worked and is not something to worry about.  Looks weird, though, I know.

**Mac Users:  test to be sure you've got the right version of Python installed**

Sometimes it's possible to get PyCharm going on a Mac with Python 2.7 rather than 3.6.  This is a simple test:

1. Go back into your testing.py file that you just used.
2. Change your line of text to `print 1 + 1`.
3. Attempt to run it.
4. In the bottom panel where the output is, you should get either the number `2` or an error that says:

```python
File "<ipython-input-1-28aa8a406b63>", line 1
    print 1 + 1
          ^
SyntaxError: Missing parentheses in call to 'print'
```

**The error means that you have hooked PyCharm to the right version of Python and you're done!**

**If you got the number 2, then you need to go back to the step where you are connecting Anaconda to the right interpreter.**
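If you would rather not rely on an error message, another way to check is to ask Python for its own version. This is only a sketch of an alternative test; the variable name `is_python_3` is made up for illustration.

```python
import sys

# Print the full version string of whichever Python is running this file.
print(sys.version)

# sys.version_info.major is 3 for any Python 3.x and 2 for any Python 2.x.
is_python_3 = sys.version_info.major == 3
print(is_python_3)
```

If the last line prints `False`, PyCharm is still pointed at a Python 2 interpreter.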

# Programming languages versus software

We're used to opening up software for specific tasks.  We work inside of highly stylized digital files that remove many traces of the digital-ness of these files and the computers. This makes sense.  We just want to get our jobs done.

Software programs are all written by humans via programming languages.  Software packages are usually written to be a tool to solve a need or serve a purpose.  You can't usually hack something to do anything you want (although sometimes things are flexible enough that you can get away with it).  

Programming languages are similar in this way: they are tools written in other programming languages to solve a need or serve a purpose.  Some languages are designed for very specialized tasks, while others are more general purpose.  These languages are also written languages, so they each have their own linguistic style.


## Programming languages can be classified into groups

_Very_ roughly, programming languages can be described by their purpose and their style.

Purpose:
* Web languages (HTML, PHP, JavaScript, etc...)
* Scripting (Python, Perl, R, etc...)
* Software languages (Java, C++, etc...)

Style:
* Language syntax
* Abstraction level
* Methodological approach
* etc...


## That's neat, but what about Python?

In this description universe, we can describe Python as:
* A general purpose programming language, most commonly used for scripting
* But with growing web and software design usage
* A very high level language designed for clarity of expression

![title](grid.png)


## Points of interest in the Python biography

* Created by Guido van Rossum
    * Who is now the Benevolent Dictator for Life (BDFL)
* Is open source
    * Community maintained
    * Features created by its users
    * Adapts quickly to industry needs
    * Free to use
    * Consensus based development
* Written in the C language
    * In the standard implementation (CPython), your Python code is compiled to bytecode, which is then run by an interpreter written in C.
    * This does make execution technically slower than in C, but this is rarely a problem unless you're doing computationally intensive scientific computing.
* Was designed with a philosophy
    * And this design philosophy is maintained by the community, with Guido's dictatorship.


In [1]:
# you can run this in PyCharm

import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


In [25]:
# print(u'\U0001F4D6')

print(len("Elizabeth") * u'\U0001F426')



🐦🐦🐦🐦🐦🐦🐦🐦🐦


## Python and why we are learning it

## How the history of computing impacts programming languages

![title](US-ASCII_code_chart.png)

* Everything is numbers
* Some of those numbers are displayed as letters.
    * Those letters become words, and sometimes are collected as documents.
* Some of those numbers become the values for pixel colors.
    * Those pixels become images, and some of those images are part of video.
* Some of those numbers represent frequencies and sounds.
    * Those sounds become audio recordings.
* Meanwhile, computers need to have more messages than just letters.  Before we had computers as we understand them now, there were Teletype machines.  These machines needed commands that controlled the physical printing process, so there is a series of [control characters](https://en.wikipedia.org/wiki/Control_character) that direct the machine how to work.
* These control characters still exist and some are still used and impact processing. For the curious, you can read about the history of the newline character: https://en.wikipedia.org/wiki/Newline.  
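To see the "everything is numbers" idea from inside Python, the built-in functions `ord()` and `chr()` convert between a character and its number. A small sketch; the sample strings are arbitrary.

```python
# ord() gives the number behind a character; chr() goes the other way.
print(ord("A"))   # 65
print(chr(97))    # a

# The newline control character is a number too.
newline_number = ord("\n")
print(newline_number)   # 10

# repr() shows control characters instead of acting on them.
print(repr("line one\nline two"))

# Every letter of a word has its own number.
raven_numbers = [ord(letter) for letter in "Raven"]
print(raven_numbers)   # [82, 97, 118, 101, 110]
```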
   
![title](Teletype-IMG_7287.jpg)

In [56]:
with open('raven.txt', 'r') as file_in:
    poem = file_in.read()
    
stanzas = poem.split("\n\n")

raven_count = [s.count("Raven") + s.count("bird") + s.count("fowl") for s in stanzas]
lenore_count = [s.count("Lenore") for s in stanzas]
chamber_count = [s.count("chamber") for s in stanzas]
door_count= [s.count("door") for s in stanzas]

print("Counts of 'bird'/'Raven'/'fowl' as \U0001F426, Lenore as \U0001F452,\n"
      "'chamber' as \U0001F3E0, and door as \U0001F6AA in stanzas of The Raven")
print()
for stanza_num, bird_count in enumerate(raven_count):
    print(str(stanza_num).zfill(2) + ": ", end = "") 
    print(u'\U0001F3E0' * chamber_count[stanza_num], end = "")
    print(u'\U0001F6AA' * door_count[stanza_num], end = "")
    print(u'\U0001F426' * bird_count, end = "")
    print(u'\U0001F452' * lenore_count[stanza_num])

Counts of 'bird'/'Raven'/'fowl' as 🐦, Lenore as 👒,
'chamber' as 🏠, and door as 🚪 in stanzas of The Raven

00: 🏠🏠🚪🚪
01: 👒👒
02: 🏠🏠🚪🚪
03: 🏠🚪🚪
04: 👒👒
05: 🏠
06: 🏠🏠🚪🚪🐦
07: 🐦🐦🐦
08: 🏠🏠🚪🚪🐦🐦
09: 🐦🐦
10: 
11: 🚪🐦🐦🐦🐦
12: 🐦
13: 🐦👒👒
14: 🐦🐦
15: 🐦🐦👒👒
16: 🚪🚪🐦🐦
17: 🏠🚪🐦


# Data Types

There are parts of speech in a Python expression.  Informally and generally, these are:

* Nouns -> objects
* Verbs -> functions and operators
* Prepositions/Conjunctions -> keywords that provide more structure and direction

You use combinations of prepositions/conjunctions to form a structure that contains verbs that operate on objects. (Don't angry tweet me because you disagree with this; we're talking very generally here, not making scientific classifications.)

Just like with writing, there are types of programs.  Essays and poems have very different structures even though they might be written using the same language.  Writing programs is no different.  Sometimes these programs require a lot of introduction and preparation, other times they need a lot of definitions and rule creation, and other times you can just bust in and write a few powerpoint slides.

Here's a crappy poem that describes the Raven program:

* Read in the raven
* Separate the stanza text
* Count the words I want


* Print the chart title
* Looping over the stanzas
* Print the emoji


* Joy


But you may have a more complex problem where you need to more systematically approach it and tease some initial things out:

* Given a set of documents, where we can test that there indeed is one chapter per file, and all files are stripped of header and tail information, we can then begin to look at the content.
* The content is then a series of lines of text, which should be broken down into constituent words.  However, the relationship between these words and the lines they came from needs to be retained in some fashion such that the lines can be stitched back together after processing each word.

The data types are the bits and pieces that we work with and manipulate to craft the narrative of accomplishments within these descriptions.
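As a rough illustration of the second description (keeping each word tied to the line it came from), here is one possible shape for it. The sample text and the "processing" step (uppercasing) are stand-ins chosen for this sketch, and it borrows pieces such as lists that are introduced later in the day.

```python
# Hypothetical sample text; any multi-line string would do.
text = "Once upon a midnight dreary\nwhile I pondered weak and weary"

# Break the text into lines, then each line into its words.
# The nested list keeps every word attached to the line it came from.
lines = text.split("\n")
words_by_line = [line.split(" ") for line in lines]

# Process each word (here: uppercase it) without losing its line.
processed = [[word.upper() for word in words] for words in words_by_line]

# Stitch the words back into lines, and the lines back into one text.
rebuilt = "\n".join(" ".join(words) for words in processed)
print(rebuilt)
```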

## Numbers

Computers represent and store numbers differently than humans. Languages like Java actually have several different numerical data types. Python (at least in everyday use) only requires programmers to deal with two numerical data types. Why would we need more than one way to store numbers in the first place?

This has to do with the legacy of early computing and how computers store data in memory. The key fact is that storing some kinds of numbers takes up less memory than storing others. With 64 bits of memory, practically any number you are likely to need can be stored. However, not all numbers need that much space. Many everyday numbers fit comfortably in 16 bits.

Recall our earlier discussion about how early computer languages were closer to the computer's internal structure. When memory was very limited and expensive, allowing a number to take up more memory than necessary could render a project impossible from an engineering or budgetary perspective. Programmers gained a real benefit from the variety of numerical data types that allowed them fine control over how much memory to allocate for each number they needed to store.

Compared to the early days of computing, the amount of memory available in today's computers is incredibly huge and cheap (though not infinite!). Programmers can afford to be more relaxed when it comes to managing memory. Python is also better than early languages at automatically deciding how much memory is needed to store a given number and so takes care of much of this in the background for us. However, there are still two different numerical data types that we need to remember:

* integers (whole numbers)
* floating point numbers (decimal numbers)
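One way to see Python handling memory for you: in Python 3, an `int` simply grows as large as it needs to. A small sketch; the variable name `big` is arbitrary.

```python
# 2 to the power of 100 is far larger than fits in 64 bits.
big = 2 ** 100
print(big)               # 1267650600228229401496703205376

# It is still an ordinary int as far as Python is concerned.
print(type(big))         # <class 'int'>

# bit_length() reports how many bits the value needs.
print(big.bit_length())  # 101
```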


### Integers:  Whole numbers

The `int` type, short for integer, is used for whole numbers, numbers without any decimal points.  

In [57]:
print(1 + 1)

2


In [58]:
type(1)

int

In [59]:
type(1 + 1)

int

### Floating point numbers:  Numbers with decimal values

The `float` type, short for floating point number, is any number with a decimal value.

In [61]:
print(1.5 + 2.5)

4.0


In [62]:
print(1.5 + .3)

1.8


In [63]:
print(type(1.5))

<class 'float'>


In [65]:
print(type(10/2))

<class 'float'>


### Interaction of `int` and `float`

`int` and `float` numbers get along pretty well, but there are some aspects which you might find surprising. We'll go into more detail about this later.  (You haven't met some of these operators yet, but you can understand the basics!)

In [66]:
print(1 + 0.4) # int + float == float

1.4


In [67]:
print(4 - .1) # int - float == float

3.9


In [68]:
print(type(1.0)) # whole numbers can be represented as floats as well

<class 'float'>


In [69]:
print(1 == 1.0) # different data types, but equal in value

True
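One of the surprises worth seeing early: floats are stored in binary, so some decimal values are only approximated. The sketch below uses `math.isclose()` from the standard library as one common way to compare floats.

```python
import math

total = 0.1 + 0.2
print(total)                     # 0.30000000000000004
print(total == 0.3)              # False

# Compare floats with a tolerance rather than with ==.
print(math.isclose(total, 0.3))  # True

# Floor division of two ints stays an int; true division always gives a float.
print(type(7 // 2))              # <class 'int'>
print(type(7 / 2))               # <class 'float'>
```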


## Words

Words are pretty simple here.  All the basic characters you see, the ones you can see and type on your keyboard, are represented as characters.  These characters can be strung together as strings! (pun intended)

### Strings

The `str` type, short for string, contains character data.  There are three ways to indicate that something is a string.


`"like this, with double quotes"`

`'or like this, with single quotes'`

`""" or use`<br>
`three double quotes`<br>
`together to make`<br>
`a multi-line string"""`

Feel free to use either ' or ", but try to be consistent.  The multi-line string is generally used for text that spans several lines or for certain types of documentation.

### Why do we need three different ways to write strings?

Why do you not want choice, human?

In [78]:
print("Feel free to use either single or double quotes")

print('You just need to make them match.')

# BUT I NEED THAT SWEET SWEET DOUBLE QUOTE INSIDE OF DOUBLE QUOTES
# you're covered

print("You can escape it out, like this: \" see the backslash?")

# and now I have a long chunk of text that I don't want to parse

print("""I am a meat popsicle.
I live to be eaten.
My soul is frozen.
There is a stick.""")

Feel free to use either single or double quotes
You just need to make them match.
You can escape it out, like this: " see the backslash?
I am a meat popsicle.
I live to be eaten.
My soul is frozen.
There is a stick.
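Having both quote styles also means you can often avoid escaping altogether. A short sketch with made-up sentences; the variable name `tab_and_newline` is just for illustration.

```python
# A single quote can sit inside double quotes, and vice versa.
print("It's fine to put a single quote inside double quotes.")
print('She said "hello" and left.')

# The backslash also writes control characters: \t is a tab, \n a newline.
tab_and_newline = "first\tsecond\nthird"
print(tab_and_newline)

# An escape sequence counts as one character.
print(len("a\nb"))   # 3
```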


# What can you do with the essentials?

Now that we've talked about the two essential data types for storing content (numbers and words), we'll start exploring what you can do with these essential data types.

# Numerical operators

Numbers are the core of computing, so naturally there are a ton of things you can do with them.  

### The basic operators


| Operator |                       Numeric Operation                       | Example |
|:--------:|:-------------------------------------------------------------:|:----------------:|
| +        | Addition                                                      | `>>> 2 + 5` <br> `7` | 
| -        | Subtraction                                                   | `>>> 2 - 5` <br> `-3` | 
| *        | Multiplication                                                | `>>> 2 * 5` <br> `10` | 
| **       | Exponent                                                      | `>>> 2 ** 5` <br> `32` | 
| /        | Division <br>Python 2.\*: floor division for integers<br>Python 3.\*: true division | `>>> 2 / 5` <br> `0.4` (Python 2.\* gives `0`)<br>`>>> 2. / 5.` <br> `0.4` | 
| //       | floor division<br>both 2.\* and 3.\*                            | `>>> 2 // 5` <br> `0`<br>`>>> 2. // 5.` <br> `0.0` | 
| %        | modulo<br>(returns the remainder)                                | `>>> 2 % 5` <br> `2`<br>`>>> 4 % 2` <br> `0` | 

### Modulo?

Referred to as 'mod', modulo returns the remainder of your division value.

Remember the parts of a division calculation:

* Quotient
    * The 'whole number' value
* Remainder
    * What is 'left over'
    
What is returned from modulo is not the decimal value!

Modulo is most valuable for checking if a number is divisible by another number (e.g. checking if a number is even or odd).

In [86]:
print(5. / 2.)

2.5


In [87]:
print(5 % 2)

1
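To tie quotient and remainder together, here is a small sketch using `//` and `%` on the same numbers, followed by the even/odd check mentioned above. The value 17 is arbitrary.

```python
number = 17

quotient = number // 5    # the 'whole number' part
remainder = number % 5    # what is 'left over'
print(quotient, remainder)        # 3 2

# Putting the parts back together gives the original number.
print(quotient * 5 + remainder)   # 17

# A number is even when dividing by 2 leaves no remainder.
ten_is_even = 10 % 2 == 0
seven_is_even = 7 % 2 == 0
print(ten_is_even, seven_is_even) # True False
```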


## String operations


Strings can be sliced, meaning that subsections can be accessed.  The slicing notation is:

`string[start:stop:step]`

* `start` is the index position to start at
    * Assumes start if omitted
* `stop` is the index position to stop at
    * Assumes end if omitted
* `step` is how many positions to move over
    * Assumes 1 if omitted
    
Let's open up your interactive session and test this out.

Most of the cleverness you'll find in Python relates to this start:stop:step notation.

In [77]:
print("hello"[:]) # everything
print("hello"[0:len("hello")]) # also everything
print("hello"[::2]) # skip every other letter
print("hello"[0:2]) # yup, indexing starts at 0

hello
hello
hlo
he
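Index positions can also be negative, which counts from the end of the string. A few more slices on a different word, as a sketch:

```python
word = "python"

print(word[0])     # p  (first character)
print(word[-1])    # n  (last character)
print(word[1:4])   # yth  (positions 1, 2 and 3; stop is not included)
print(word[-3:])   # hon  (the last three characters)
print(word[::-1])  # nohtyp  (a step of -1 walks backwards)
```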


You try a few on your own:

* get just the l's
* skip every three letters

Anything else neat?

### Numerical-looking operations on strings

Some of the numerical operators are overloaded to work on strings and other data types in Python.

Strings can also be:

* Added via `+`
    * `'a' + 'b' == 'ab'`
* Multiplied by integers for repetition
    * `'hello' * 4 == 'hellohellohellohello'`
* Or combined!

In [88]:
print("oh " + len("screw") * "*" + " this " + len("crap") * "*")

oh ***** this ****


Can you create a concatenation of the first and last letters of `"hello"`?

## One more quick data type: the boolean

## Conditional testing

The bulk of your creative design in programming will be making conditional checks on content.  These are boolean tests, or logical tests.  There are only two values things can be:

* `True`
* `False`

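The `==` operator (is equal to) is one way to make these checks; comparisons such as `<` and `>`, and the membership test `in`, also produce boolean answers.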

Some examples:

* Is this a title that I have seen already?
* Was this word found more than 100 times?
* Is the length of the word longer than 10 characters?

There are a variety of ways to construct these tests, but the goal is to produce a boolean answer.

In [79]:
print(1 == 1)

True

In [80]:
print(5 < 10)

True

In [81]:
print(len("foo") > 5) # len() checks the length of whatever you pass it

False

In [82]:
'a' in 'abcdefg'

True

In [84]:
print('i' in 'team')

False
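Boolean answers can be combined with `and`, `or`, and `not`. The sketch below mirrors the three example questions above; all of the values are invented for illustration.

```python
seen_titles = "Dracula, Emma, Frankenstein"   # hypothetical titles
title = "Emma"
word = "nevermore"
word_count = 120

already_seen = title in seen_titles   # Is this a title I have seen already?
is_frequent = word_count > 100        # Was this word found more than 100 times?
is_long = len(word) > 10              # Is the word longer than 10 characters?
print(already_seen, is_frequent, is_long)   # True True False

print(is_frequent and is_long)   # False: both must be True
print(is_frequent or is_long)    # True: one is enough
print(not already_seen)          # False: not flips the answer
```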


# Variables

Now that we're making more and bigger stuff, we need to store those things with memorable (hopefully?) names.  This allows us to reference things where the content might change, or avoid having to type the same info multiple times.

We use a single `=` sign to create assignments. The basic syntax is `variable = value`, so to the left of the `=` is the variable name, and the stuff on the right is the value you want to store with that name.

In [92]:
x = 4
name = "Elizabeth"

In [99]:
print("Hello " * x, name + "!")

Hello Hello Hello Hello  Elizabeth!
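A variable can be given a new value at any time, but values already computed from it do not update themselves. A small sketch with made-up names:

```python
count = 3
greeting = "Hello " * count
print(greeting)

# Reassign count using its own current value.
count = count + 1
print(count)      # 4

# greeting was built when count was 3, so it has not changed.
print(greeting)
```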


# The most basic container: the list

Lists are:

* ordered
* flat hierarchy 
* present in nearly every program you'll ever write
* designated by square brackets:  `[]`

In [90]:
colors = ['red', 'yellow', 'green', 'blue']
nouns = ['cat', 'dog', 'shark']
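Lists use the same index and slice notation as strings. A brief sketch, reusing the `colors` list from above:

```python
colors = ['red', 'yellow', 'green', 'blue']

print(colors[0])     # red  (positions start at 0)
print(colors[-1])    # blue  (negative positions count from the end)
print(colors[1:3])   # ['yellow', 'green']
print(len(colors))   # 4

# Lists can grow; append() adds an item to the end.
colors.append('purple')
print(colors)
```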

# Containers in context

Can we automatically generate madlibs using this base?

* A list of many colors to choose from
* a number
* A list of several nouns to choose from
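One possible way to automate this, sketched with the standard library's `random` module. The sentence template is invented for the example, and the output will differ from run to run.

```python
import random

colors = ['red', 'yellow', 'green', 'blue']
nouns = ['cat', 'dog', 'shark']

# random.choice() picks one item from a list.
color = random.choice(colors)
noun = random.choice(nouns)

# random.randint(1, 10) picks a whole number from 1 to 10, both included.
number = random.randint(1, 10)

sentence = "I saw " + str(number) + " " + color + " " + noun + "s today."
print(sentence)
```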

## Pair activity:  constructing madlibs

1. Pair up with someone near you!
2. One of you constructs a list with:
    * an integer number
    * a color
    * an animal name
3. The other constructs a sentence that uses a number, a color, and an animal name.

Use your knowledge of list index positions and string concatenation to write a madlib sentence that uses those values, knowing that the item at position `0` is the number, `1` is the color, and `2` is the animal name. TIP!  You'll need to recast the number to a string via `str()`.

![title](singleton_party_parrot.png)
![title](parrot_2.png)
![title](parrot_3.png)
![title](parrot_4.png)

In [108]:
my_list = [2, 'red', 'gecko']

print("I'd have to work for " + str(my_list[0]) + " years to save for a " + my_list[2] + " with " + my_list[1] + " eyes.")

I'd have to work for 2 years to save for a gecko with red eyes.


# Functions

Functions are structures that modularize specific sections of code.  These are most useful when you are doing a task that is pretty repetitive, meaning that you can call a function repeatedly rather than having to write out all that code over and over.

We've been working with functions this whole time, for example `type()` and `print()`.  We call them by using their names, and sometimes we pass values to them.  Example:  in `type(1)` the function is `type` and you're passing it an object of `1`.  `type` then returns a value to us that we can store or print out.  

## Function vocabulary

* function
    * a piece of code that has been compartmentalized and named
* name
    * not really a vocabulary piece, but functions all have names, e.g. 'type' is the name of `type()`
* call
    * when you want to make a function do its thing, you have to call it by name.  Otherwise it just sits there, dormant but ready to go, until executed.  I call `type()` by executing the code `type(1)`.
* pass values
    * functions are little semi-independent buckets that sit in a corner and wait to do their work.  They often need to act on variables and values but because they are usually independent of when/how they're used in a larger context, they won't always know the variable names in question.  So when you stick values or variables inside of the `()` when you call a function, you're passing that function a little bundle of data to crunch on.
* return something
    * We're used to printing out values to our screen, but sometimes we need to capture those values for reuse.  So when I print `2 + 2` I see `4` on my screen, but what if I need to use that calculated value later on?  This is where `return` comes into play.  
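The difference between printing and returning is easier to see side by side. In this sketch, `show_sum` and `get_sum` are made-up function names; the `def` syntax they use is covered in "Stop talking, how do I do it?".

```python
def show_sum(a, b):
    # Displays the result on screen but hands nothing back.
    print(a + b)

def get_sum(a, b):
    # Hands the result back to whoever called the function.
    return a + b

shown = show_sum(2, 2)   # prints 4
kept = get_sum(2, 2)     # prints nothing

print(shown)       # None: there was nothing to capture
print(kept * 10)   # 40: the returned value can be reused
```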
    
## Function scope

Very generally, functions can 'see' variables outside of them, but variables inside a function cannot be 'seen' by any other portion of the program.  This is why we have to `return` values back out of a function.
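A small sketch of that rule, with invented names. The `try`/`except` lines are only there so the script keeps running after the expected error; they are not something you need to know yet.

```python
prefix = "Dr. "               # defined outside any function

def add_title(name):
    titled = prefix + name    # the function can see 'prefix'
    return titled

result = add_title("Lenore")
print(result)

try:
    print(titled)             # 'titled' only exists inside add_title
    visible_outside = True
except NameError:
    visible_outside = False
    print("'titled' cannot be seen out here")
```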

## Defining functions

When defining a function, that code is both executed and not executed.  Sort of an interstitial state.  When a function is defined, the code belonging to it is **learned** but not actually run.  The code within a function is only run when the function itself is called.  This means you can have errors (other than syntax errors) in your code within a function but your script will run just fine unless you actually call that function.
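To see this "learned but not run" state in action, here is a sketch with a deliberately broken function. The names `broken` and `undefined_name` are made up, and the `try`/`except` is only there so the demonstration can finish.

```python
def broken():
    # undefined_name does not exist, but Python only notices on a call.
    return undefined_name + 1

print("The function was defined without complaint.")

try:
    broken()
    call_failed = False
except NameError:
    call_failed = True
    print("Calling it is what triggers the error.")
```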

## Stop talking, how do I do it?

The basic syntax of making functions is:

```python
def function_name(parameter_variables):
    return "the stuff you want to do"
```

Let's adapt this for our madlibs.

In [4]:
def my_mad_lib(number, color, animal):
    result = "I'd have to work for " + str(number) + \
             " years to save for a " + animal + " with " + color + " eyes."
    return result

In [5]:
print(my_mad_lib(2, "red", "gecko"))

I'd have to work for 2 years to save for a gecko with red eyes.
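Once the sentence lives in a function, reusing it is one short call per set of inputs. A sketch that repeats the function from above so it runs on its own; the extra inputs are invented.

```python
def my_mad_lib(number, color, animal):
    result = "I'd have to work for " + str(number) + \
             " years to save for a " + animal + " with " + color + " eyes."
    return result

# Each inner list holds a number, a color, and an animal name.
inputs = [[2, "red", "gecko"], [7, "blue", "shark"]]

sentences = []
for item in inputs:
    sentences.append(my_mad_lib(item[0], item[1], item[2]))

for sentence in sentences:
    print(sentence)
```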


# Pair activity!

Within your pairs, each of you independently come up with a new madlib function. It should ask for certain parameters and print out a new creation.  Without sharing what the madlib is, ask each other for the appropriate inputs.  Test and share!

# Lunch break