## Getting started

Before we begin, let's cover a few basics about Jupyter notebooks (formerly known as IPython notebooks). To run a cell and move to the next one, press Shift+Enter. To run a cell without advancing to the next, press Ctrl+Enter.

IPython is the interactive Python shell that executes the code in each cell.

### Variable assignment, basic calculations, and data types

In [0]:
## CODE CELL 1
# This is a comment. Comments will not appear in the output when a cell is run.

a = 45    # assigning the value 45 to the variable "a"
print(a)

As you may have guessed, the `print()` function displays the value of the argument that's passed to it (i.e. whatever is inside the parentheses).

In [0]:
## CODE CELL 2
# Let's assign a couple more variables 
b = 12
c = a + b
print(c)

In [0]:
## CODE CELL 3
# Let's increment "c" by 2
## print() accepts multiple comma-separated arguments, and they can be of different data types

print('c =', c)
c = c + 2
print('c + 2 =', c)

Try running the previous cell again. What happens?

In [0]:
## CODE CELL 4
# Another useful method to increment

print('c =', c)
c += 2
print('Now c =', c)

Decrementing works similarly (the operator is `-=`).

Other useful operations:

- subtraction: `a - b`
- multiplication: `a * b`
- division: `a / b`
- floor division (the integer part, or quotient, of a division operation): `a // b`
- modulo (remainder): `a % b`

In [0]:
## CODE CELL 5
# Some of the above in action
print('a', a)
print('b', b)
div = a/b
print('a / b is', div)
floor = a//b
## Floor division returns the largest integer that is less than or equal to the result of the division.
print('a // b is', floor)
mod = a%b
print('a % b is', mod)
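
Floor division and modulo are tied together: multiplying the quotient by the divisor and adding the remainder gives back the original number. The sketch below uses the same values as the cells above, redefined so it runs on its own; the names `quotient`, `remainder`, and `pair` are illustrative and not part of the notebook. The built-in `divmod()` returns both results in one call.

```python
# Same values as the cells above, redefined so this cell is self-contained
a = 45
b = 12

quotient = a // b      # floor division
remainder = a % b      # modulo

# divmod() returns the pair (a // b, a % b)
pair = divmod(a, b)

print(quotient, remainder)              # 3 9
print(pair)                             # (3, 9)
print(b * quotient + remainder == a)    # True
```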

So far, you've seen three data types: strings, integers, and floats. We can use the function `type()` to find out what data type a value or variable represents.

In [0]:
## CODE CELL 6

type(3.75)      # this is a float
# A float is a floating-point number, which means it is a number that has a decimal point.

In [0]:
## CODE CELL 7

type(3)     # this is an integer
# An integer is a whole number with no fractional part.

In [0]:
## CODE CELL 8

type('This is a string.')      # this is a string
# A string is a sequence of characters.

In [0]:
## CODE CELL 9

type("This is also a string.")     # double quotes or single quotes can be used

The `math` module is part of the standard library and has a lot of useful functions. To use it, we need to import it into this notebook.

In [0]:
## CODE CELL 10

import math

math.sqrt(9)

In [0]:
## CODE CELL 11

math.pi

3.141592653589793

To learn more about the various functions belonging to the `math` module, call the `help()` function on it. Alternatively, you can read the online documentation for this module here: https://docs.python.org/3/library/math.html. This applies to any module, class, function, etc. that you may want more information on.

In [0]:
## CODE CELL 12
## help() also works on a single function in a module, e.g. help(math.sqrt)

help(math)

To convert a number from a float into an integer, use the function `int()`:

In [0]:
## CODE CELL 13
## casting a float to an integer

a = int(6.0)
type(a)
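
Keep in mind that `int()` does not round: it drops the fractional part, which means it truncates toward zero. A short sketch, with the built-in `round()` shown for comparison (the variable names are illustrative):

```python
truncated_pos = int(6.9)     # fractional part is dropped
truncated_neg = int(-6.9)    # truncates toward zero, not downward
rounded = round(6.9)         # rounds to the nearest integer

print(truncated_pos)   # 6
print(truncated_neg)   # -6
print(rounded)         # 7
```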

Conversely, you can convert an integer into a float using `float()`:

In [0]:
## CODE CELL 14

float(6)

What if we have a number in string format?

In [0]:
## CODE CELL 15
# This doesn't work: adding a string and an integer raises a TypeError

'6.5'+7

In [0]:
## CODE CELL 16
# This works

float('6.5')+7

If we want to convert an integer or float to a string, we can use the function `str()`:

In [0]:
## CODE CELL 17

str(100)

### Strings: cleaning and manipulation

Indexing in Python starts from 0. That means that the first element of any string, list, array, etc. is actually considered to be element # 0.

In [0]:
## CODE CELL 18

myString = 'Rutgers is one of the top 10 oldest colleges in the U.S.'

Accessing characters in the string:

In [0]:
## CODE CELL 19

myString[0]

In [0]:
## CODE CELL 20

myString[1]

You can also access characters with reference to the end of the string:

In [0]:
## CODE CELL 21

myString[-1]

In [0]:
## CODE CELL 22

myString[-2]

To access larger portions (called "slices") of the string, we can use the following syntax: `string[startIndex:endIndex:stepSize]`. The string returned will start from the character at index `startIndex`, but it will end with the character at index `endIndex - 1`. If not specified, `stepSize` = 1, `startIndex` = 0, and `endIndex` = one beyond the last index.
For example,

In [0]:
## CODE CELL 23

myString[5:11]

In [0]:
## CODE CELL 24

myString[1:11:2]

In [0]:
## CODE CELL 25

myString[:11]

In [0]:
## CODE CELL 26

myString[11:]

In [0]:
## CODE CELL 27

myString[:]

What do you think this will yield?

In [0]:
## CODE CELL 28

myString[-3:11:-1]

Let's look at other useful string operations.

In [0]:
## CODE CELL 29
# How long is the string?

len(myString)

In [0]:
## CODE CELL 30
# Concatenation

myString2 = '; it was originally "Queen\'s College".'
print(myString + myString2)

**Exercise 1:**

Take the following two "sentences": (1) Next bus: 7 minutes. (2) Travel time on bus: 5 minutes. Using string indexing, extract the number of minutes for each activity, add the two numbers to get the total time (variable *total*), and print the following message using concatenation: "Next bus: 7 minutes. Travel time on bus: 5 minutes. Total time to destination: *total*".

*Hint: You will have to do at least one data type conversion.*

**Answer 1:**

In [0]:
## CODE CELL 31

sentence1 = 'Next bus: 7 minutes.'
sentence2 = 'Travel time on bus: 5 minutes.'
sentence3 = 'Total time to destination:'

## ENTER CODE HERE 





There's a lot you can do with strings! Let's go through a few useful methods:

In [0]:
## CODE CELL 32
# Converting to all lower case

new = 'RU'
new.lower()

In [0]:
## CODE CELL 33
# Converting to all upper case

new.lower().upper()

Notice that you can chain multiple methods on the same line; in the above cell, `lower()` runs first, and `upper()` is then executed on the string it returns (the result of `new.lower()`).

In [0]:
## CODE CELL 34
# Removing leading and trailing characters
# (the argument is treated as a set of characters to strip, not as an exact substring)

s = 'aaaabcccHow are you doing?aayzz'
s.strip('aaaabccc')
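
One detail worth knowing: the argument to `strip()` is treated as a set of characters to remove, not as an exact prefix or suffix. The sketch below (the names `result_long`, `result_short`, and `result_both` are illustrative) shows that the order and repetition of characters in the argument make no difference.

```python
s = 'aaaabcccHow are you doing?aayzz'

# 'aaaabccc' and 'cba' describe the same set of characters: a, b, c
result_long = s.strip('aaaabccc')
result_short = s.strip('cba')

print(result_long)                    # How are you doing?aayzz
print(result_long == result_short)    # True

# The trailing 'aayzz' survives because the string ends in 'z',
# which is not in the set. Adding y and z to the set removes it too.
result_both = s.strip('abcyz')
print(result_both)                    # How are you doing?
```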

In [0]:
## CODE CELL 35
# Removing leading and trailing whitespace

t = '  Could be worse.  '
t = t.strip()
t    # display the stripped string

In [0]:
## CODE CELL 36
# Removing leading characters only

u = '???How is the family?'
u.lstrip('?')

In [0]:
## CODE CELL 37
# Removing trailing characters only

v = '...Alice twisted her ankle playing basketball on Saturday...'
v.rstrip('.')

In [0]:
## CODE CELL 38
# Finding the position (index) of the first occurrence of a character

v.find('l')

In [0]:
## CODE CELL 39
# str.find() can also be used for a substring; it returns the index of the first character in the substring

v.find('Alice')

In [0]:
## CODE CELL 40
# Replacing all instances of a character or substring

y = 'Day 1: Prep. Day 2: Execute. Day 3: Review.'
y.replace('Day ', '')

Notice that you can remove characters from a string by substituting in an empty string. Strings are immutable; that is, you can't add characters to or remove characters from a string. You can create a new string through concatenation or slicing that has more or fewer characters, **but the original string cannot be altered**. Whenever you apply a method to a string, you're just creating a new string. (You can assign the new string to the original variable and thereby appear to "change" the original string, but you are still creating a new string in the process. The original variable simply stores the new string instead of the old string.)

In [0]:
## CODE CELL 41

y    # y still has its original value

In [0]:
## CODE CELL 42
# To "save your changes", assign the new string to a variable

new_y = y.replace('Day ', '')
new_y
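
To see immutability directly, try changing a character in place. In this minimal sketch (the string and variable names are illustrative), item assignment raises a `TypeError`, while building a new string from pieces works.

```python
word = 'Rutgers'

try:
    word[0] = 'r'          # strings do not support item assignment
except TypeError as err:
    print('TypeError:', err)

# Build a new string through concatenation and slicing instead
lowered = 'r' + word[1:]
print(lowered)   # rutgers
print(word)      # Rutgers (the original is unchanged)
```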

In [0]:
## CODE CELL 43
# Splitting a string into a list of words

z = 'Once upon a time in a land far, far away...'
z.split()

**The escape character**

Let's look at the following string:

In [0]:
## CODE CELL 44
# What happens when you try to print it?

print('We will meet at 4 o'clock.')

What happened? Essentially, once you start a string with a single quote, Python will take the next single quote it sees as a signal to end the string (and similarly with double quotes). Here, that means that `clock.'` is seen as something undefined that is outside the string, so Python raises a `SyntaxError`.

There are two ways around this. One is to use the "other" quotation mark type inside the string - so if you're using single quotes inside the string, use double quotes around the string, and vice versa. The other is "escaping" the internal quotation marks by using a backslash ("\") just before the quotation mark like so:

In [0]:
## CODE CELL 45
# Python is ok with this

print('We will meet at 4 o\'clock.')

This way, we're letting Python know that the internal single quotes should be treated as part of the ongoing string.
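
The first workaround, switching the outer quotation marks, looks like this. It is a short sketch with illustrative variable names; both forms produce the same string.

```python
double_quoted = "We will meet at 4 o'clock."    # double quotes around the string
escaped = 'We will meet at 4 o\'clock.'         # backslash escape inside single quotes

print(double_quoted)
print(double_quoted == escaped)   # True
```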


Finally, how can we introduce whitespace characters like tabs into a string? Answer: we use escape sequences.

In [0]:
## CODE CELL 46
# To introduce a tab character, use \t

score = 'Home:\t16'
print(score)

In [0]:
## CODE CELL 47
# To introduce a newline character (similar to pressing Enter on your keyboard), use \n

scores = 'Home:\t16\nAway:\t24'
print(scores)

### Lists: working with a data collection

Many of the operations we used with strings can be applied to lists as well, including
- indexing
- slicing
- finding the length
- concatenation

In [0]:
## CODE CELL 48
# Creating a list of the top 40 U.S. cities by population

topcities = ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Philadelphia', 'Phoenix', 'San Antonio', 'San Diego',
         'Dallas', 'San Jose', 'Austin', 'Jacksonville', 'San Francisco', 'Indianapolis', 'Columbus', 'Fort Worth',
         'Charlotte', 'Seattle', 'Denver', 'El Paso', 'Detroit', 'Washington', 'Boston', 'Memphis', 'Nashville', 'Portland',
         'Oklahoma City', 'Las Vegas', 'Baltimore', 'Louisville', 'Milwaukee', 'Albuquerque', 'Tucson', 'Fresno', 'Sacramento',
         'Kansas City', 'Long Beach', 'Mesa', 'Atlanta', 'Colorado Springs']
print(topcities)

In [0]:
## CODE CELL 49
# Which is the 5th most populous city?

topcities[4]

In [0]:
## CODE CELL 50
# Which cities are ranked #11-#20?

topcities[10:20]

In [0]:
## CODE CELL 51
# Are there really 40 cities in the list?

len(topcities)

In [0]:
## CODE CELL 52
# Let's add the next 5 cities to topcities

cities41to45 = ['Virginia Beach', 'Raleigh', 'Omaha', 'Miami', 'Oakland']
topcities + cities41to45

However, **lists, unlike strings, are mutable** - their contents can be changed in place without creating a new list. (The concatenation in the previous cell built a new list and left `topcities` unchanged.)

In [0]:
## CODE CELL 53
# Another way to add multiple items to the list is to use the "extend" method

topcities.extend(cities41to45)
len(topcities)
#print(topcities)
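
The difference between concatenation and `extend` can be checked directly. This sketch uses a short illustrative list rather than `topcities`: the `+` operator builds a new list and leaves the original alone, while `extend` changes the original list in place.

```python
cities = ['New York', 'Los Angeles', 'Chicago']
more = ['Houston', 'Philadelphia']

combined = cities + more     # new list; cities is unchanged
print(len(cities))           # 3
print(len(combined))         # 5

cities.extend(more)          # modifies cities in place
print(len(cities))           # 5
print(cities == combined)    # True
```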

In [0]:
## CODE CELL 54
# You can also add items one at a time using the "append" method

topcities.append('Minneapolis')
topcities[-2:]   # let's just look at the end of the list

In [0]:
## CODE CELL 55
# Is Orlando in the list?

'Orlando' in topcities

In [0]:
## CODE CELL 56
# Is Dallas in the list?

'Dallas' in topcities

In [0]:
## CODE CELL 57
# Which position is Dallas in?

topcities.index('Dallas')

In [0]:
## CODE CELL 58
# Let's get the list in alphabetical order

topcities.sort()
topcities[:10]   # just looking at the first 10 to verify sorting

In [0]:
## CODE CELL 59
# Sorting also works on numbers

newList = [3,53,7,768,7,4,563]
newList.sort()
newList
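
Because `sort()` changes the list in place, the original ordering (for `topcities`, the population ranking) is lost. If you need to keep it, the built-in `sorted()` function returns a new sorted list instead. A minimal sketch with illustrative values:

```python
ranked = ['New York', 'Los Angeles', 'Chicago', 'Houston']

alphabetical = sorted(ranked)    # new list; ranked keeps its order
print(alphabetical)   # ['Chicago', 'Houston', 'Los Angeles', 'New York']
print(ranked)         # ['New York', 'Los Angeles', 'Chicago', 'Houston']

# list.sort() sorts in place and returns None
result = ranked.sort()
print(result)         # None
print(ranked)         # ['Chicago', 'Houston', 'Los Angeles', 'New York']
```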

You can even create a list of lists.

In [0]:
## CODE CELL 60
# Indexing with a list of lists

nestedList = [[1,2,3],[2,4,6],[3,6,9],[4,8,12]]
print(nestedList[0])

In [0]:
## CODE CELL 61
# Accessing a single element in a sublist

print(nestedList[0][2])

In [0]:
## CODE CELL 62
# Slicing

print(nestedList[:2])

There's a lot you can do with lists. A brief overview can be found here: https://www.tutorialspoint.com/python/python_lists.htm; full documentation can be found at the official Python documentation page.

**Exercise 2:**

You have a class of students whose scores on the last exam were as follows: 89, 79, 83, 85, 95, 50, 77, 90, 100, 91, 69, 87, 93, 88. Find the median score by taking the average of the two "middle" scores. 

*Hint: First, sort the scores. To find the locations of the "middle" scores, first determine the number of scores you're dealing with using the len() function.*

**Answer 2:**

In [0]:
## CODE CELL 63

scores = [89, 79, 83, 85, 95, 50, 77, 90, 100, 91, 69, 87, 93, 88]

## ENTER CODE HERE 


*Note:*

Jupyter Notebook is an interactive environment, so you can type the name of a variable on the last line of a cell and run the cell to see the value stored in that variable as output. If you're writing code in a script (e.g. a .py file such as those created in the IDLE or Spyder Python environments), you will need to use the `print()` function to explicitly show the "contents" of the variable.

## Now the real fun begins...

Before we start playing with data files, we need to cover one more really important section.

### Loops, conditionals, and functions

If you have used a programming language before, you're probably familiar with the for-loop. For everyone else, a for-loop is a way of iterating through a data structure - a string, a list, a dictionary, etc. - or file. It's a way to execute the same piece of code multiple times with a parameter being updated on every iteration.

What is a data structure? In computer science, a data structure is a format for organizing, managing, and storing data that enables efficient access and modification.

In [0]:
## CODE CELL 64
# The go-to first example of a for-loop

for i in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]:
    print(i)

The `range()` function is useful here. It can be given up to three parameters: `([start], stop, [step])` (brackets indicate optional parameters). As with slicing in strings and lists, the loop will stop at `stop - 1`.

In [0]:
## CODE CELL 65
# range(1, 11) would reproduce the loop above without typing out the list;
# adding a step of 2 prints every other number instead
## End at stop-1

for i in range(1,11, 2):
    print(i)
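
Wrapping `range()` in `list()` is a quick way to see which values a given set of parameters produces. A short sketch (the variable names are illustrative):

```python
stop_only = list(range(5))            # start defaults to 0, step to 1
start_stop = list(range(1, 11))       # 1 through 10
with_step = list(range(1, 11, 2))     # every other number
counting_down = list(range(10, 0, -2))   # a negative step counts down

print(stop_only)       # [0, 1, 2, 3, 4]
print(start_stop)      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(with_step)       # [1, 3, 5, 7, 9]
print(counting_down)   # [10, 8, 6, 4, 2]
```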

You may be wondering why we chose the letter "i" to iterate through the lists above. The answer is that it's just convention - letters like "i" and "j" are often used for iteration, but in practice, you can use any letters, letter + number combinations, or even an underscore.
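
As a small sketch of these naming choices (the lists and names here are illustrative): an underscore is commonly used when the loop variable itself is never needed, and a descriptive name helps when it is.

```python
# The loop variable is unused, so an underscore signals "ignore this"
greetings = []
for _ in range(3):
    greetings.append('hello')
print(greetings)   # ['hello', 'hello', 'hello']

# A descriptive name makes the loop body easier to read
lengths = []
for city in ['Boston', 'Denver', 'Mesa']:
    lengths.append(len(city))
print(lengths)     # [6, 6, 4]
```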

One common use of a for-loop is to iteratively append elements to a list.

In [0]:
## CODE CELL 66
# What are the first ten multiples of 3?

multiples = []    # initializing list
for i in range(1,11):
    multiples.append(3*i)

print(multiples)

You can also use multiple layers of loops (called "nested loops").

In [0]:
## CODE CELL 67
# What are the first five multiples of numbers 10-14?

for i in range(10,15):
    myList = []
    for j in range(1,6):
        myList.append(i*j)
    print(myList)

As you begin writing more complex code, you may find it helpful to use this step-by-step visualization tool: http://pythontutor.com/visualize.html#.

What if we want to control what code gets executed based on certain conditions? That's where conditional statements come in. Let's look at some operators you'll likely use:

In [0]:
## CODE CELL 68

print(3 < 4)    # less than

In [0]:
## CODE CELL 69

print(4 <= 4)    # less than or equal to

In [0]:
## CODE CELL 70

print('a' == 'A')    # equal to

In [0]:
## CODE CELL 71

print(4 != 4)    # not equal to

In [0]:
## CODE CELL 72

print(2 >= 10)     # greater than or equal to

In [0]:
## CODE CELL 73

print('a' > 'A')    # greater than

You may have noticed some unexpected behavior. When comparing strings, these operators indicate how Python sorts them lexicographically - 'a', 'b', etc. come *after* 'A', 'B', etc.
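
Python compares strings character by character using each character's Unicode code point, which the built-in `ord()` function reveals. A short sketch showing why lowercase letters sort after uppercase ones (the variable names are illustrative):

```python
upper_code = ord('A')
lower_code = ord('a')
print(upper_code)   # 65
print(lower_code)   # 97

# 'Z' (90) comes before 'a' (97), so 'Zebra' sorts before 'apple'
case_sensitive = 'Zebra' < 'apple'
print(case_sensitive)     # True

# To compare without regard to case, convert both sides first
case_insensitive = 'Zebra'.lower() < 'apple'.lower()
print(case_insensitive)   # False
```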

The return values of True/False are called "Booleans". These are actual values that can be assigned to a variable:

In [0]:
## CODE CELL 74

var = True
new_var = False

print('var:', var, '\nnew_var:', new_var)

In Part 1, we saw another case where the result was a Boolean; we were checking if "Orlando" was in the `topcities` list. The `in` and `not in` membership checks return True or False.
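
A short sketch of both checks, using a shortened, illustrative stand-in for the full list. The `in` operator also works on strings, where it tests for a substring.

```python
cities = ['New York', 'Los Angeles', 'Chicago', 'Dallas']

has_orlando = 'Orlando' in cities
missing_orlando = 'Orlando' not in cities
has_dallas = 'Dallas' in cities
has_substring = 'York' in 'New York'    # substring test on a string

print(has_orlando)       # False
print(missing_orlando)   # True
print(has_dallas)        # True
print(has_substring)     # True
```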

Now, we can implement some of these comparisons in what's called an if-else statement. The gist of it is this: if {some condition}, execute some code; for all other cases, execute some other code.

In [0]:
## CODE CELL 75
# On which days could we potentially have a picnic?

forecast7Day = ['rain', 'mostly cloudy', 'rain', 'mostly cloudy', 'sunny', 'partly cloudy', 'rain']
picnic = []
for i in forecast7Day:
    if i == 'rain':
        picnic.append('no')
    else:
        picnic.append('yes')
        
print(picnic)

We can include more "options" by incorporating "elif" statements.

In [0]:
## CODE CELL 76
# How many layers do I need to wear for the next few days?

forecastTemps = [50, 59, 72, 74, 60, 62, 63]
layers = []
for i in forecastTemps:
    if i < 60:
        layers.append('wear a jacket')
    elif i >= 70:
        layers.append("don't need a jacket or sweater")
    else:
        layers.append('wear a sweater')

print(layers)

You can incorporate as many elif statements as you'd like.
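
Here is a sketch with several `elif` branches; the scores and grade thresholds are illustrative assumptions. Conditions are checked from top to bottom and only the first one that is true runs, so the order of the branches matters.

```python
scores = [95, 83, 77, 50]
grades = []
for score in scores:
    if score >= 90:
        grades.append('A')
    elif score >= 80:      # only reached if score < 90
        grades.append('B')
    elif score >= 70:      # only reached if score < 80
        grades.append('C')
    else:
        grades.append('F')

print(grades)   # ['A', 'B', 'C', 'F']
```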

One last type of control structure - the while-loop. The general structure is the following: while {some condition}, execute some code. Iteration will continue until that condition is no longer true.

In [0]:
## CODE CELL 77
# Using up a gift card

balance = 110     # initial balance = $110
while balance - 20 >= 0:
    print('Your balance is now $' + str(balance))
    balance -= 20    # using up $20 for each purchase
print('Final balance: $' + str(balance))

Finally, functions. Functions are extremely useful for when you want to execute a section of code repeatedly, but with parameters (called "arguments") for which values can be defined when the function is called. Functions are defined with `def` and then a user-provided name.

In [0]:
## CODE CELL 78
# A function to generalize the gift card code in the previous example

def giftCard(init_balance, purchase_size):    # this function has two arguments
    balance = init_balance
    while balance - purchase_size >= 0:
        print('Your balance is now $' + str(balance))
        balance -= purchase_size
    print('Final balance: $' + str(balance))

What happened when you ran the previous cell?

In order to use the function, we have to call it.

In [0]:
## CODE CELL 79
# Calling the giftCard function

newCard = giftCard(200, 50)    # initial balance = $200, purchase_size = $50
print(newCard)    # prints None, because giftCard() has no return statement

Try calling `giftCard()` with different parameters.
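
Since `giftCard()` only prints its results, nothing is handed back to the caller. If you want to use the final balance later in your code, the function needs a `return` statement. The variant below is a hypothetical sketch: the name `giftCardBalance` is not part of this notebook.

```python
def giftCardBalance(init_balance, purchase_size):
    """Return the balance left after making as many purchases as possible."""
    balance = init_balance
    while balance - purchase_size >= 0:
        balance -= purchase_size
    return balance    # hand the result back to the caller

leftover = giftCardBalance(110, 20)
print(leftover)                    # 10
print(giftCardBalance(200, 50))    # 0
```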


**Exercise 3:**

How many words are in the first sentence of Charles Dickens's *Oliver Twist*? How many times does the word "which" appear?

*Hint: The punctuation characters in this paragraph are given in the remove_punc list. Think about using the `replace` and `split` string methods from Part 1 of the workshop to get a list of words without punctuation.*

In [0]:
## CODE CELL 80

oliverT = '''Among other public buildings in a certain town, which for many reasons
it will be prudent to refrain from mentioning, and to which I will
assign no fictitious name, there is one anciently common to most towns,
great or small: to wit, a workhouse; and in this workhouse was born; on
a day and date which I need not trouble myself to repeat, inasmuch as
it can be of no possible consequence to the reader, in this stage of
the business at all events; the item of mortality whose name is
prefixed to the head of this chapter.'''

remove_punc = [',', ':', ';', '.']

## ENTER CODE HERE

*References*:

The following materials were consulted during development of this notebook:

J. Zelle, *Python Programming: An Introduction to Computer Science*, 2nd ed. Sherwood, Oregon: Franklin, Beedle & Associates Inc., 2010.

Python 3 Documentation from the Python Software Foundation: https://docs.python.org/3/