# Python review: Values, variables, types, lists, and strings

These first few notebooks are a set of exercises with two goals:

1. Review the basics of Python
2. Familiarize you with Jupyter

Regarding the first goal, these initial notebooks cover material we think you should already know from [Chris Simpkins's](https://www.cc.gatech.edu/~simpkins/) [Python Bootcamp from Fall 2018](http://datamastery.gitlab.io/msabc/august2018.html). This is the bootcamp that on-campus Georgia Tech MS Analytics students took.

Regarding the second goal, you'll observe that the bootcamp has each student install and work directly with the Python interpreter, which runs locally on his or her machine (e.g., see the `Video: Getting Started` link and Slide 7 of his Intro to Python slides). But in this course, we are using Jupyter Notebooks as the development environment. You can think of a Jupyter notebook as a web-based "skin" for running a Python interpreter---possibly hosted on a remote server, which is the case in this course. Here is a good tutorial on [Jupyter](https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook).

> **Note for [MS Analytics](http://analytics.gatech.edu) students.** In this course we assume you are using [Vocareum's deployment](https://www.vocareum.com/) of Jupyter. You also have an option to use other Jupyter environments, including installing and running Jupyter on your own system. We can't provide technical support to you if you choose to go those routes, but if you'd like to do that anyway, we recommend [Microsoft Azure Notebooks](https://notebooks.azure.com/) as a web-hosted option or the Continuum Analytics [Anaconda distribution](https://www.continuum.io/downloads) as a locally installed option.

**Study hint: Read the test code!** You'll notice that most of the exercises below have a place for you to code up your answer followed by a "test cell." That's a code cell that checks the output of your code to see whether it appears to produce correct results. You can often learn a lot by reading the test code. In fact, sometimes it gives you a hint about how to approach the problem. As such, we encourage you to try to read the test cells even if they seem cryptic, which is deliberate!
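
To make the idea of a test cell concrete, here is a minimal sketch. The function `double` and its checks are hypothetical, made up purely for illustration; they do not come from any exercise in this notebook.

```python
# Hypothetical "solution cell": the name `double` is an illustration only.
def double(x):
    return 2 * x

# Hypothetical "test cell": each `assert` raises an AssertionError if its
# condition is false. If every condition holds, execution reaches the print.
result = double(21)
assert result == 42
assert type(result) is int
print("\n(Passed!)")
```

Reading the assertions tells you what the grader expects: here, both the value and the type of the result.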

**Exercise 0** (1 point). Run the code cell below. It should display the output string, `Hello, world!`.

In [1]:
print("Hello, world!")

Hello, world!


**Exercise 1** (`x_float_test`: 1 point). Create a variable named `x_float` whose numerical value is one (1) and whose type is *floating-point*.

In [2]:
#
# YOUR CODE HERE
#
x_float=float(1)

In [3]:
# `x_float_test`: Test cell
assert x_float == 1
assert type(x_float) is float
print("\n(Passed!)")


(Passed!)
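
As a side note, the sketch below shows why the test checks the type as well as the value. The variable names are assumptions chosen for illustration. An integer and a float can compare equal while still having different types.

```python
x_int = 1          # integer literal
x_lit = 1.0        # floating-point literal
x_conv = float(1)  # explicit conversion from int to float

# All three compare equal by value...
assert x_int == x_lit == x_conv

# ...but only the last two have type `float`.
print(type(x_int), type(x_lit), type(x_conv))
```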


**Exercise 2** (`strcat_ba_test`: 1 point). Complete the following function, `strcat_ba(a, b)`, so that given two strings, `a` and `b`, it returns the concatenation of `b` followed by `a` (pay attention to the order in these instructions!).

In [4]:
def strcat_ba(a, b):
    assert type(a) is str
    assert type(b) is str
    #
    # YOUR CODE HERE
    #
    return b + a


In [5]:
# `strcat_ba_test`: Test cell

# Workaround: Python 3.5.2 does not have `random.choices()` (available in 3.6+)
def random_letter():
    from random import choice
    return choice('abcdefghijklmnopqrstuvwxyz')

def random_string(n, fun=random_letter):
    return ''.join([str(fun()) for _ in range(n)])

a = random_string(5)
b = random_string(3)
c = strcat_ba(a, b)
print('strcat_ba("{}", "{}") == "{}"'.format(a, b, c))
assert len(c) == len(a) + len(b)
assert c[:len(b)] == b
assert c[-len(a):] == a
print("\n(Passed!)")

strcat_ba("lwcdw", "emv") == "emvlwcdw"

(Passed!)
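
The two slicing assertions in the test cell can look cryptic at first. The small sketch below reuses the strings from the sample output above to show what each slice selects; the variable names simply mirror those in the test cell.

```python
a, b = 'lwcdw', 'emv'
c = b + a                 # 'emvlwcdw'

print(c[:len(b)])         # the first len(b) characters: emv
print(c[-len(a):])        # the last len(a) characters: lwcdw
```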


**Exercise 3** (`strcat_list_test`: 2 points). Complete the following function, `strcat_list(L)`, which generalizes the previous function: given a *list* of strings, `L[:]`, it returns the concatenation of the strings in reverse order. For example:

```python
    strcat_list(['abc', 'def', 'ghi']) == 'ghidefabc'
```

In [6]:
def strcat_list(L):
    assert type(L) is list
    #
    # YOUR CODE HERE
    reverse_L = L[::-1]  # reversed copy of the list
    return ''.join(reverse_L)
    #


In [7]:
# `strcat_list_test`: Test cell
n = 3
nL = 6
L = [random_string(n) for _ in range(nL)]
Lc = strcat_list(L)

print('L == {}'.format(L))
print('strcat_list(L) == \'{}\''.format(Lc))
assert all([Lc[i*n:(i+1)*n] == L[nL-i-1] for i, x in zip(range(nL), L)])
print("\n(Passed!)")

L == ['czf', 'eem', 'yzs', 'oyb', 'ykl', 'feo']
strcat_list(L) == 'feoykloybyzseemczf'

(Passed!)
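
One possible alternative, sketched here under the hypothetical name `strcat_list_alt`, uses the built-in `reversed()` instead of a slice. It is just another way to reach the same result, not the required approach.

```python
def strcat_list_alt(L):
    # `reversed(L)` iterates from the last element to the first without
    # building a reversed copy; `str.join` concatenates in that order.
    return ''.join(reversed(L))

print(strcat_list_alt(['abc', 'def', 'ghi']))  # prints ghidefabc
```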


**Exercise 4** (`floor_fraction_test`: 1 point). Suppose you are given two variables, `a` and `b`, whose values are the real numbers, $a \geq 0$ (non-negative) and $b > 0$ (positive). Complete the function, `floor_fraction(a, b)` so that it returns $\left\lfloor\frac{a}{b}\right\rfloor$, that is, the *floor* of $\frac{a}{b}$. The *type* of the returned value must be `int` (an integer).

In [8]:
20.0//3

6.0
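
The output above hints at a subtlety: floor division, `//`, returns a `float` whenever either operand is a `float`. The sketch below, with variable names chosen only for illustration, shows why an explicit `int()` conversion is needed to satisfy the type requirement.

```python
q_int = 20 // 3        # both operands are ints, so the result is an int
q_float = 20.0 // 3    # one operand is a float, so the result is a float

print(q_int, type(q_int))        # 6 <class 'int'>
print(q_float, type(q_float))    # 6.0 <class 'float'>
print(int(q_float))              # 6
```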

In [9]:
def is_number(x):
    """Returns `True` if `x` is a number-like type, e.g., `int`, `float`, `Decimal()`, ..."""
    from numbers import Number
    return isinstance(x, Number)
    
def floor_fraction(a, b):
    assert is_number(a) and a >= 0
    assert is_number(b) and b > 0
    #
    # YOUR CODE HERE
    return int(a//b)
    #


In [10]:
# `floor_fraction_test`: Test cell
from random import random
a = random()
b = random()
c = floor_fraction(a, b)

print('floor_fraction({}, {}) == floor({}) == {}'.format(a, b, a/b, c))
assert b*c <= a <= b*(c+1)
assert type(c) is int
print('\n(Passed!)')

floor_fraction(0.4722555366624973, 0.17964581014188075) == floor(2.6288146452707086) == 2

(Passed!)


**Exercise 5** (`ceiling_fraction_test`: 1 point). Complete the function, `ceiling_fraction(a, b)`, which for any numeric inputs, `a` and `b`, corresponding to real numbers, $a \geq 0$ and $b > 0$, returns $\left\lceil\frac{a}{b}\right\rceil$, that is, the *ceiling* of $\frac{a}{b}$. The type of the returned value must be `int`.

In [11]:
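def ceiling_fraction(a, b):
    assert is_number(a) and a >= 0
    assert is_number(b) and b > 0
    #
    # YOUR CODE HERE
    #
    # `a // b + 1` overshoots by one when `a` is an exact multiple of `b`
    # (e.g., a=4, b=2), so take the ceiling of the quotient directly.
    from math import ceil
    return int(ceil(a / b))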


In [12]:
# `ceiling_fraction_test`: Test cell
from random import random
a = random()
b = random()
c = ceiling_fraction(a, b)
print('ceiling_fraction({}, {}) == ceiling({}) == {}'.format(a, b, a/b, c))
assert b*(c-1) <= a <= b*c
assert type(c) is int
print("\n(Passed!)")

ceiling_fraction(0.8838344615530194, 0.38204964300632926) == ceiling(2.3134021395705826) == 3

(Passed!)
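
Random floating-point inputs almost never produce a quotient that is an exact integer, so the test cell rarely exercises that edge case. The sketch below uses two hypothetical helpers, `floor_plus_one` and `true_ceiling`, to show where "floor plus one" and the ceiling disagree.

```python
from math import ceil

def floor_plus_one(a, b):
    # Hypothetical helper: the floor of a/b, plus one.
    return int(a // b + 1)

def true_ceiling(a, b):
    # Hypothetical helper: the ceiling of a/b via `math.ceil`.
    return ceil(a / b)

for a, b in [(7, 2), (6, 2), (0, 5)]:
    print(a, b, floor_plus_one(a, b), true_ceiling(a, b))
# 7 2 4 4
# 6 2 4 3
# 0 5 1 0
```

The two agree only when `a / b` is not an integer. Note also that the condition `b*(c-1) <= a <= b*c` holds for both candidate values of `c` when `a` is an exact multiple of `b`, so that assertion alone cannot tell them apart.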


**Exercise 6** (`report_exam_avg_test`: 1 point). Let `a`, `b`, and `c` represent three exam scores as numerical values. Complete the function, `report_exam_avg(a, b, c)` so that it computes the average score (equally weighted) and returns the string, `'Your average score: XX'`, where `XX` is the average rounded to one decimal place. For example:

```python
    report_exam_avg(100, 95, 80) == 'Your average score: 91.7'
```
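
As a rough sketch of the arithmetic in this example, the snippet below computes the average and shows two common ways to obtain one decimal place: the built-in `round()` and the `.1f` format specifier. The variable name `avg` is chosen only for illustration.

```python
avg = (100 + 95 + 80) / 3     # true division always yields a float

print(round(avg, 1))          # 91.7
print('{:.1f}'.format(avg))   # 91.7
```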

In [21]:
def report_exam_avg(a, b, c):
    assert is_number(a) and is_number(b) and is_number(c)
    #
    # YOUR CODE HERE
    avg = round((a + b + c) / 3, 1)
    return 'Your average score: {}'.format(avg)
    #
    


In [22]:
# `report_exam_avg_test`: Test cell
msg = report_exam_avg(100, 95, 80)
print(msg)
assert msg == 'Your average score: 91.7'

print("Checking some additional randomly generated cases:")
for _ in range(10):
    ex1 = random() * 100
    ex2 = random() * 100
    ex3 = random() * 100
    msg = report_exam_avg(ex1, ex2, ex3)
    ex_rounded_avg = float(msg.split()[-1])
    abs_err = abs(ex_rounded_avg*3 - (ex1 + ex2 + ex3)) / 3
    print("{}, {}, {} -> '{}' [{}]".format(ex1, ex2, ex3, msg, abs_err))
    assert abs_err <= 0.05

print("\n(Passed!)")

Your average score: 91.7
Checking some additional randomly generated cases:
18.479883382959294, 75.3189624951411, 55.78011679849961 -> 'Your average score: 49.9' [0.04034577446665821]
12.169766907636392, 98.41722258525346, 19.500675249838906 -> 'Your average score: 43.4' [0.0374450857570802]
91.44936218532253, 88.62828371787083, 19.55509416516361 -> 'Your average score: 66.5' [0.04424668945231739]
64.47411832909596, 64.5151909667317, 58.880886976450874 -> 'Your average score: 62.6' [0.023398757426169443]
61.88575035483418, 43.6382439929617, 15.809364684676641 -> 'Your average score: 40.4' [0.044453010824175486]
19.44898363553036, 62.39302579080124, 73.43770833945148 -> 'Your average score: 51.8' [0.040094078072305216]
38.783938440557776, 47.009564040720534, 57.87482098438046 -> 'Your average score: 47.9' [0.010558844780405252]
72.3927907658938, 13.372093864851964, 52.1794561522625 -> 'Your average score: 46.0' [0.018553072330576015]
92.63342807459838, 50.161398310620775, 79.18187598902

**Exercise 7** (`count_word_lengths_test`: 2 points). Write a function `count_word_lengths(s)` that, given a string consisting of words separated by spaces, returns a list containing the length of each word. Words will consist of lowercase alphabetic characters, and they may be separated by multiple consecutive spaces. If a string is empty or has no spaces, the function should return an empty list.

For instance, in this code sample,

```python
   count_word_lengths('the quick  brown   fox jumped over     the lazy  dog') == [3, 5, 5, 3, 6, 4, 3, 4, 3]
```

the input string consists of nine (9) words whose respective lengths are shown in the list.

In [27]:
test='the quick  brown   fox jumped over     the lazy  dog'
[len(i) for i in test.split()]


[3, 5, 5, 3, 6, 4, 3, 4, 3]
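
The experiment above works because `str.split()` with no argument treats any run of whitespace as a single separator. The sketch below contrasts it with `split(' ')`, which splits on every individual space and so produces empty strings; the variable names are assumptions for illustration.

```python
s = 'the quick  brown'        # two spaces between 'quick' and 'brown'

by_whitespace = s.split()     # runs of spaces collapse into one separator
by_single = s.split(' ')      # every single space is a separator
only_spaces = '   '.split()   # a string of only spaces has no words

print(by_whitespace)          # ['the', 'quick', 'brown']
print(by_single)              # ['the', 'quick', '', 'brown']
print(only_spaces)            # []
```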

In [31]:
def count_word_lengths(s):
    assert type(s) is str
    assert all([x.isalpha() or x == ' ' for x in s])
    #
    # YOUR CODE HERE
    #
    # Per the problem statement, an empty string or one with no spaces yields [].
    if ' ' not in s:
        return []
    # `split()` with no argument treats runs of consecutive spaces as one separator.
    return [len(w) for w in s.split()]

    

In [32]:
s1='the better would be the best best best'
s2='I like football'
str_set=s1.split()+s2.split()
str_all={i:0 for i in str_set}
s1_str=str_all.copy()
s2_str=str_all.copy()
for i in s1.split():
    print(i)
    s1_str[i]+=1      
s1_str


the
better
would
be
the
best
best
best


{'the': 2,
 'better': 1,
 'would': 1,
 'be': 1,
 'best': 3,
 'I': 0,
 'like': 0,
 'football': 0}

In [33]:
strs=('good','ok')
a={i:0 for i in strs}
a

{'good': 0, 'ok': 0}

In [34]:
# `count_word_lengths_test`: Test cell

# Test 1: Example
qbf_str = 'the quick brown fox jumped over the lazy dog'
qbf_lens = count_word_lengths(qbf_str)
print("Test 1: count_word_lengths('{}') == {}".format(qbf_str, qbf_lens))
assert qbf_lens == [3, 5, 5, 3, 6, 4, 3, 4, 3]

# Test 2: Random strings
from random import choice # 3.5.2 does not have `choices()` (available in 3.6+)
#return ''.join([choice('abcdefghijklmnopqrstuvwxyz') for _ in range(n)])

def random_letter_or_space(pr_space=0.15):
    from random import choice, random
    is_space = (random() <= pr_space)
    if is_space:
        return ' '
    return random_letter()

S_LEN = 40
W_SPACE = 1 / 6
rand_str = random_string(S_LEN, fun=random_letter_or_space)
rand_lens = count_word_lengths(rand_str)
print("Test 2: count_word_lengths('{}') == '{}'".format(rand_str, rand_lens))
c = 0
while c < len(rand_str) and rand_str[c] == ' ':
    c += 1
for k in rand_lens:
    print("  => '{}'".format (rand_str[c:c+k]))
    assert (c+k) == len(rand_str) or rand_str[c+k] == ' '
    c += k
    while c < len(rand_str) and rand_str[c] == ' ':
        c += 1
    
# Test 3: Empty string
print("Test 3: Empty strings...")
assert count_word_lengths('') == []
assert count_word_lengths('   ') == []

print("\n(Passed!)")

Test 1: count_word_lengths('the quick brown fox jumped over the lazy dog') == [3, 5, 5, 3, 6, 4, 3, 4, 3]
Test 2: count_word_lengths(' hjhjurukuzxclqecqdtqkf pz leugqdh bake ') == '[22, 2, 7, 4]'
  => 'hjhjurukuzxclqecqdtqkf'
  => 'pz'
  => 'leugqdh'
  => 'bake'
Test 3: Empty strings...

(Passed!)


**Fin!** You've reached the end of this part. Don't forget to restart and run all cells again to make sure it's all working when run in sequence; and make sure your work passes the submission process. Good luck!