## Python Fundamentals

In [None]:
Variables, evaluations, conditionals, Data types, Strings

In [None]:
number_of_tweets_per_hour

## Example Word Count Python Code

In [1]:
import re
from collections import Counter
from nltk.corpus import stopwords

def split_into_words(any_chunk_of_text):
    lowercase_text = any_chunk_of_text.lower()
    split_words = re.split(r"\W+", lowercase_text)
    return split_words

filepath_of_text = "../texts/The-Yellow-Wallpaper.txt"
nltk_stop_words = stopwords.words("english")

with open(filepath_of_text, encoding="utf-8") as file_object:
    full_text = file_object.read()

all_the_words = split_into_words(full_text)
meaningful_words = [word for word in all_the_words if word not in nltk_stop_words]
count_of_meaningful_words = Counter(meaningful_words)
most_frequent_meaningful_words = count_of_meaningful_words.most_common(40)

print(most_frequent_meaningful_words)

[('john', 45), ('one', 33), ('said', 30), ('would', 27), ('get', 24), ('see', 24), ('room', 24), ('pattern', 24), ('paper', 23), ('like', 21), ('little', 20), ('much', 16), ('good', 16), ('think', 16), ('well', 15), ('know', 15), ('go', 15), ('really', 14), ('thing', 14), ('wallpaper', 13), ('night', 13), ('long', 12), ('course', 12), ('things', 12), ('take', 12), ('always', 12), ('could', 12), ('jennie', 12), ('great', 11), ('says', 11), ('feel', 11), ('even', 11), ('used', 11), ('dear', 11), ('time', 11), ('enough', 11), ('away', 11), ('want', 11), ('never', 10), ('must', 10)]


## Variables

Variables are one of the fundamental building blocks of Python. A variable is like a tiny container where you store information that you're going to use later on: filenames, words, numbers, collections of words and numbers, etc. Putting information into a variable is called "assigning" a value to the variable, and you do it with a single equals sign, `=`.
> A single `=` assigns a value; it does not test whether two things are equal. To check equality, Python uses two equals signs side by side, `==`, e.g. `2 * 2 == 4`.
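
To make the difference between `=` and `==` concrete, here is a minimal sketch. The variable names `number_of_pages` and `is_four` are made up for illustration and do not appear elsewhere in this lesson.

```python
# A single = assigns: store the result of 2 * 2 in a variable
number_of_pages = 2 * 2

# A double == compares: the comparison evaluates to True or False
is_four = number_of_pages == 4

print(number_of_pages)
print(is_four)
```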

Let's look at the variables that we used when we counted the most frequent words in Charlotte Perkins Gilman's "The Yellow Wallpaper."

In [2]:
filepath_of_text = "../texts/The-Yellow-Wallpaper.txt"

In the above cell, we assigned the filepath of our "The Yellow Wallpaper" text file ("../texts/The-Yellow-Wallpaper.txt") to the variable `filepath_of_text`. We can check to see what's "inside" the variable by running a cell with the variable's name.

> Outside the Jupyter environment, you would need to run `print(filepath_of_text)` to display the variable. You can run the `print` function in Jupyter, too, but you don't need to do so, which is convenient.

In [5]:
filepath_of_text

'../texts/The-Yellow-Wallpaper.txt'

In [6]:
print(filepath_of_text)

../texts/The-Yellow-Wallpaper.txt
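
Notice that the first output has quotation marks and the second does not. As far as I know, Jupyter displays the last value in a cell using its `repr()` form, while `print` uses its `str()` form. Here is a small sketch of that difference using Python's built-in `repr` and `str` functions; the names `as_displayed_by_jupyter` and `as_printed` are hypothetical and only there for illustration.

```python
filepath_of_text = "../texts/The-Yellow-Wallpaper.txt"

# repr() keeps the quotation marks, so you can tell the value is a string
as_displayed_by_jupyter = repr(filepath_of_text)

# str() gives the plain text, which is what print() shows
as_printed = str(filepath_of_text)

print(as_displayed_by_jupyter)
print(as_printed)
```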


We also assigned our list of stopwords to a variable called `nltk_stop_words`.

In [3]:
nltk_stop_words = stopwords.words("english")

In [4]:
nltk_stop_words

['i',
 'me',
 'my',
 'myself',
 'we',
 'our',
 'ours',
 'ourselves',
 'you',
 "you're",
 "you've",
 "you'll",
 "you'd",
 'your',
 'yours',
 'yourself',
 'yourselves',
 'he',
 'him',
 'his',
 'himself',
 'she',
 "she's",
 'her',
 'hers',
 'herself',
 'it',
 "it's",
 'its',
 'itself',
 'they',
 'them',
 'their',
 'theirs',
 'themselves',
 'what',
 'which',
 'who',
 'whom',
 'this',
 'that',
 "that'll",
 'these',
 'those',
 'am',
 'is',
 'are',
 'was',
 'were',
 'be',
 'been',
 'being',
 'have',
 'has',
 'had',
 'having',
 'do',
 'does',
 'did',
 'doing',
 'a',
 'an',
 'the',
 'and',
 'but',
 'if',
 'or',
 'because',
 'as',
 'until',
 'while',
 'of',
 'at',
 'by',
 'for',
 'with',
 'about',
 'against',
 'between',
 'into',
 'through',
 'during',
 'before',
 'after',
 'above',
 'below',
 'to',
 'from',
 'up',
 'down',
 'in',
 'out',
 'on',
 'off',
 'over',
 'under',
 'again',
 'further',
 'then',
 'once',
 'here',
 'there',
 'when',
 'where',
 'why',
 'how',
 'all',
 'any',
 'both',
 'each

Our stopwords variable shows how useful variables can be. Rather than writing out all 179 stopwords in this list, we put them all into a variable and used them there.

Same goes for the variable `all_the_words`, where we stored all 6,000+ words from the short story.

In [7]:
all_the_words = split_into_words(full_text)


In [8]:
all_the_words

['the',
 'yellow',
 'wallpaper',
 'by',
 'charlotte',
 'perkins',
 'gilman',
 'it',
 'is',
 'very',
 'seldom',
 'that',
 'mere',
 'ordinary',
 'people',
 'like',
 'john',
 'and',
 'myself',
 'secure',
 'ancestral',
 'halls',
 'for',
 'the',
 'summer',
 'a',
 'colonial',
 'mansion',
 'a',
 'hereditary',
 'estate',
 'i',
 'would',
 'say',
 'a',
 'haunted',
 'house',
 'and',
 'reach',
 'the',
 'height',
 'of',
 'romantic',
 'felicity',
 'but',
 'that',
 'would',
 'be',
 'asking',
 'too',
 'much',
 'of',
 'fate',
 'still',
 'i',
 'will',
 'proudly',
 'declare',
 'that',
 'there',
 'is',
 'something',
 'queer',
 'about',
 'it',
 'else',
 'why',
 'should',
 'it',
 'be',
 'let',
 'so',
 'cheaply',
 'and',
 'why',
 'have',
 'stood',
 'so',
 'long',
 'untenanted',
 'john',
 'laughs',
 'at',
 'me',
 'of',
 'course',
 'but',
 'one',
 'expects',
 'that',
 'in',
 'marriage',
 'john',
 'is',
 'practical',
 'in',
 'the',
 'extreme',
 'he',
 'has',
 'no',
 'patience',
 'with',
 'faith',
 'an',
 'inten
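
Once a list is stored in a variable, you can ask questions about it without printing the whole thing. Here is a minimal sketch, assuming a short sample sentence from the story's opening rather than the full text file. The names `sample_text`, `sample_words`, and `number_of_sample_words` are illustrative, and `len` is Python's built-in function for counting the items in a list.

```python
import re

def split_into_words(any_chunk_of_text):
    # Lowercase the text, then split on runs of non-word characters
    lowercase_text = any_chunk_of_text.lower()
    split_words = re.split(r"\W+", lowercase_text)
    return split_words

# Assumption: a one-sentence stand-in for the full text of the story
sample_text = "It is very seldom that mere ordinary people like John and myself secure ancestral halls for the summer"

sample_words = split_into_words(sample_text)
number_of_sample_words = len(sample_words)

print(sample_words[:5])
print(number_of_sample_words)
```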

## Variable Names

Though we named our variables `filepath_of_text`, `nltk_stop_words`, and `all_the_words`, we could have named them almost anything else.

Variable names can be as long or as short as you want, and they can include:
- upper or lower-case letters (A-Z, a-z)
- digits (0-9), though not as the first character
- underscores (_)
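
If you're ever unsure whether a name follows these rules, Python strings have an `isidentifier()` method that can check. Here is a quick sketch; the candidate names are made up for illustration. Note that a name cannot begin with a digit, and hyphens are not allowed. This method only checks the spelling rules, not whether the name is a good idea.

```python
# Hypothetical candidate names, some valid and some not
candidate_names = ["filepath_of_text", "filepath2", "2nd_filepath", "file-path"]

# Build a dictionary that maps each candidate to True (valid) or False (invalid)
validity_by_name = {name: name.isidentifier() for name in candidate_names}

for name, is_valid in validity_by_name.items():
    print(name, is_valid)
```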

Instead of `filepath_of_text`, we could have simply named the variable `filepath`.

In [12]:
filepath = "../texts/The-Yellow-Wallpaper.txt"

In [13]:
filepath

'../texts/The-Yellow-Wallpaper.txt'

Or we could have gone even simpler and named the filepath `f`.

In [13]:
f = "../texts/The-Yellow-Wallpaper.txt"

In [14]:
f

'../texts/The-Yellow-Wallpaper.txt'

As you start to code, you will almost certainly be tempted to use extremely short variable names like `f`. Your fingers will get tired. Your coffee will wear off. You will see other people using variables like `f`. And you'll promise yourself that you'll definitely remember what `f` means.

But I must urge you: *resist the temptation of bad variable names at all costs!!*

Clear, precisely named variables will:

1. Make your code more readable (both to yourself and others)
2. Reinforce your understanding of Python and what's happening in the code
3. Sharpen your thinking

This principle applies to everyone, but I think readable code is *especially* important for students and scholars in the humanities, because we're more used to reading and writing with words. Programming languages are also still fairly new and relatively unfamiliar to many humanities folks. When we write clear code, we help build bridges among different communities of people.

### Example Python Code ❌ **With Bad Variable Names** ❌

For the sake of illustration, here's our same word count Python code with poorly named variables. This code works exactly the same and still gives us the 40 most frequently occurring words, but it's *much* harder to read. Not ideal.

In [15]:
import re
from collections import Counter
from nltk.corpus import stopwords

def sp(t):
    lt = t.lower()
    sw = re.split(r"\W+", lt)
    return sw

f = "../texts/The-Yellow-Wallpaper.txt"
st = stopwords.words("english")

with open(f, encoding="utf-8") as fo:
    ft = fo.read()

words = sp(ft)
words = [w for w in words if w not in st]
words = Counter(words)
words = words.most_common(40)

print(words)

[('john', 45), ('one', 33), ('said', 30), ('would', 27), ('get', 24), ('see', 24), ('room', 24), ('pattern', 24), ('paper', 23), ('like', 21), ('little', 20), ('much', 16), ('good', 16), ('think', 16), ('well', 15), ('know', 15), ('go', 15), ('really', 14), ('thing', 14), ('wallpaper', 13), ('night', 13), ('long', 12), ('course', 12), ('things', 12), ('take', 12), ('always', 12), ('could', 12), ('jennie', 12), ('great', 11), ('says', 11), ('feel', 11), ('even', 11), ('used', 11), ('dear', 11), ('time', 11), ('enough', 11), ('away', 11), ('want', 11), ('never', 10), ('must', 10)]


### Example Python Code ✨ **With Good Variable Names** ✨

In [17]:
import re
from collections import Counter
from nltk.corpus import stopwords

def split_into_words(any_chunk_of_text):
    lowercase_text = any_chunk_of_text.lower()
    split_words = re.split(r"\W+", lowercase_text)
    return split_words

filepath_of_text = "../texts/The-Yellow-Wallpaper.txt"
nltk_stop_words = stopwords.words("english")

with open(filepath_of_text, encoding="utf-8") as file_object:
    full_text = file_object.read()

all_the_words = split_into_words(full_text)
meaningful_words = [word for word in all_the_words if word not in nltk_stop_words]
count_of_meaningful_words = Counter(meaningful_words)
most_frequent_meaningful_words = count_of_meaningful_words.most_common(40)

print(most_frequent_meaningful_words)

[('john', 45), ('one', 33), ('said', 30), ('would', 27), ('get', 24), ('see', 24), ('room', 24), ('pattern', 24), ('paper', 23), ('like', 21), ('little', 20), ('much', 16), ('good', 16), ('think', 16), ('well', 15), ('know', 15), ('go', 15), ('really', 14), ('thing', 14), ('wallpaper', 13), ('night', 13), ('long', 12), ('course', 12), ('things', 12), ('take', 12), ('always', 12), ('could', 12), ('jennie', 12), ('great', 11), ('says', 11), ('feel', 11), ('even', 11), ('used', 11), ('dear', 11), ('time', 11), ('enough', 11), ('away', 11), ('want', 11), ('never', 10), ('must', 10)]


### Off-Limits Names

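The only variable names that are truly off-limits are Python's reserved keywords, such as `True`, `if`, or `for`. Trying to assign to one of them raises a `SyntaxError`. Names that are built into Python but not reserved, such as `print` or `list`, can technically be reassigned, but doing so hides the original function or type, so it's best to avoid them too. It's not something to worry too much about, though. In Jupyter, keywords and built-in names typically show up in green.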

In [10]:
True = "../texts/The-Yellow-Wallpaper.txt"

SyntaxError: can't assign to keyword (<ipython-input-10-fbaebf398d20>, line 1)
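
If you'd rather not rely on colors, Python's standard library can tell you which kind of name you're dealing with. Here is a minimal sketch using the `keyword` and `builtins` modules; the list `names_to_check` and the two result dictionaries are hypothetical names for illustration. A keyword such as `True` cannot be assigned to at all, while a built-in name such as `print` can be reassigned, though doing so hides the original.

```python
import builtins
import keyword

names_to_check = ["True", "print", "list", "filepath_of_text"]

# Reserved keywords can never be used as variable names
is_keyword_by_name = {name: keyword.iskeyword(name) for name in names_to_check}

# Built-in names can be reassigned, but that hides the original
is_builtin_by_name = {name: hasattr(builtins, name) for name in names_to_check}

for name in names_to_check:
    print(name, is_keyword_by_name[name], is_builtin_by_name[name])
```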

In [8]:
nltk_stop_words = stopwords.words("english")

NameError: name 'stopwords' is not defined

In [None]:
all_words = split_into_words(full_text)

In [None]:
full_text = ''

In [None]:
meme_text = ''

In [19]:
number_of_tweets_per_hour = 3

In [20]:
number_of_tweets_per_hour

3

In [21]:
print(number_of_tweets_per_hour)

3


In [22]:
number_of_tweets_per_hour = 15

In [23]:
number_of_tweets_per_hour

15

In [27]:
number_of_tweets_per_hour = input("How many tweet hours?")

How many tweet hours? 12


"Greetings! This is a Buzzfeed-style quiz that will tell you where you'd like to live based on your favorite author.

Please enter your favorite author from the following: James Joyce, Virginia Woolf, Ernest Hemingway, J.K. Rowling, Stephen King, Stephenie Meyer."

In [None]:
quiz_greeting_prompt = "✨Greetings!✨This is a Buzzfeed-style quiz that will tell you where you should take a vacation based on your favorite author. Please enter your favorite author from the following:\nJames Joyce, Virginia Woolf, Ernest Hemingway, J.K. Rowling, Stephen King, Stephenie Meyer.\n"

In [71]:
favorite_author = input(quiz_greeting_prompt)

if favorite_author == 'Stephen King':
    print("✨You'd probably like to live in: a creepy hotel✨")
    
elif favorite_author == 'Virginia Woolf':
    print("✨You'd probably like to live in: a room of your own in London✨")
    
elif favorite_author == 'James Baldwin':
    print("✨You'd probably like to live in: a neighborhood with good jazz in NYC✨")
    
elif favorite_author == 'Stephenie Meyer':
    print("✨You'd probably like to live in: a rainy town in Washington✨")
else:
    print("I don't know her")

✨Greetings! This is a Buzzfeed-style quiz that will tell you where you should take a vacation based on your favorite author. Please enter your favorite author from the following:
James Joyce, Virginia Woolf, Ernest Hemingway, J.K. Rowling, Stephen King, Stephenie Meyer.
 James Baldwin


✨You'd probably like to live in: a neighborhood with good jazz in NYC✨


In [67]:
!python Buzzfeed_Author_Quiz.py

In [35]:
favorite_author

'Stephen King'

In [36]:
if favorite_author == 'Stephen King':
    print("A creepy hotel")

A creepy hotel


In [56]:
favorite_author = input("What's your age?")

if int(favorite_author) >  16:
    print("You'd probably like to live in: a creepy hotel")
    
if favorite_author == 'Virginia Woolf':
    print("You'd probably like to live in: a room of your own in London")
    
if favorite_author == 'James Joyce':
    print("You'd probably like to live in: a pub in Dublin")
    
if favorite_author == 'Stephenie Meyer':
    print("You'd probably like to live in: a rainy town in Washington with mysterious new neighbors")

else:
    print("I don't know her")

What's your age? 15


I don't know her
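
In the cell above, each `if` is a separate statement, so the final `else` is paired only with the last `if`. That seems to be why an answer of 15 prints "I don't know her". Here is a minimal sketch of the same quiz written as a single `if`/`elif`/`else` chain, where only the first matching branch runs. It is wrapped in a hypothetical function, `suggest_home`, so you can try it without typing any input.

```python
def suggest_home(favorite_author):
    # Only the first matching branch runs; else catches everything left over
    if favorite_author == "Stephen King":
        return "a creepy hotel"
    elif favorite_author == "Virginia Woolf":
        return "a room of your own in London"
    elif favorite_author == "James Joyce":
        return "a pub in Dublin"
    elif favorite_author == "Stephenie Meyer":
        return "a rainy town in Washington with mysterious new neighbors"
    else:
        return "I don't know her"

print(suggest_home("James Joyce"))
print(suggest_home("Toni Morrison"))
```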


## Evaluations

In [None]:
Collections - Lists, Dictionaries

For Loops and List Comprehensions 

Reading and Writing Files

In [24]:
Libraries

NameError: name 'Libraries' is not defined