## Python Introduction with some detours
![Python](https://www.python.org/static/community_logos/python-logo-generic.svg)

![xkcd](https://imgs.xkcd.com/comics/python.png)

## Getting Started

## https://bit.ly/bssdh_2024_python

---

Open this Jupyter notebook (`Day 1 - Python Introduction`) on GitHub:
- https://github.com/CaptSolo/BSSDH_2024_beginners/tree/main/notebooks

Download the notebook file to your computer: click the "Download raw file" button.

![Download button](https://github.com/CaptSolo/BSSDH_2024_beginners/blob/main/notebooks/img/download_button.png?raw=1)

Open [Google Colab](https://colab.research.google.com/), choose the "Upload" tab and upload the downloaded notebook file.

* Uploaded notebooks can be found in the [Google Drive](https://drive.google.com/) folder `Colab Notebooks`

*Alternative: you may also open the whole GitHub repository in Google Colab by selecting "GitHub" in the "Open notebook" screen.*

---

You will also need a free ChatGPT account:
- https://chatgpt.com/

---

You are now ready for the workshop!

## Hello World <a class="anchor" id="hello-world">

Execute the following program code cells by pressing the triangle "run" icon:

![Run icon](https://github.com/CaptSolo/BSSDH_2024_beginners/blob/main/notebooks/img/hello_world.png?raw=1|200)

Note: this is a screenshot from Google Colab. In locally installed Jupyter notebooks the Run icon may appear in a different location.

---

In [None]:
print("Hello world!")

In [None]:
### Try printing a greeting of your own!
print("Some text here")

In [None]:
### What happens when you get an error?
print("not good)

## Python History <a class="anchor" id="python-history">



### Python created by Guido van Rossum in the early 1990s (later at Google, Dropbox)

![Guido](https://upload.wikimedia.org/wikipedia/commons/thumb/d/d0/Guido-portrait-2014-curvves.jpg/290px-Guido-portrait-2014-curvves.jpg)

### Language reference: https://docs.python.org/3/index.html

## Why Python?

- Readability
- Glue language (can use many libraries and services)
- Used from startups to Google and Facebook

### Python's popularity

![Python](https://github.com/CaptSolo/BSSDH_2024_beginners/blob/main/notebooks/img/python_growth.png?raw=1)

**Python is ranked #1 in the TIOBE programming language index (as of July 2022)**

https://www.tiobe.com/tiobe-index/

![TIOBE index ranking](https://github.com/CaptSolo/BSSDH_2024_beginners/blob/main/notebooks/img/tiobe_index.png?raw=1)

### Batteries included principle
![Batteries](https://github.com/CaptSolo/BSSDH_2024_beginners/blob/main/notebooks/img/batteries_small.jpg?raw=1)

## What is Programming? <a class="anchor" id="what-is-programming">
    
* Egg algorithm
* Computers are stupid, they only do what they are told to do
* If it is stupid but it works, then it is not stupid
* Make it work, make it right, make it fast (last two steps often not required in real life)
* GIGO principle (garbage in, garbage out)

* Error messages are nothing to be afraid of: usually the message explains what needs fixing!
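
To see how an error message explains the problem, here is a minimal sketch that triggers an error on purpose and prints the message instead of stopping. The `try` / `except` construct is not covered elsewhere in this notebook; it is used here only to capture the message, and the variable names are made up for illustration.

```python
# Trying to convert text that is not a number raises a ValueError
try:
    number = int("forty-two")
except ValueError as error:
    # The error object carries the explanation Python would normally display
    message = str(error)
    print("Python says:", message)
```

The message names both the function that failed and the value it could not handle, which is usually enough to find the fix.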

In [None]:
# Our first comment

# REPL (Read, Eval, Print, Loop)
# Python is an interpreted language (commands are executed as they come in)

## Python Installation

a) Python website (standalone installation) 
- https://www.python.org/downloads/

b) Google Colab (cloud service, no installation necessary - Google version of Jupyter Notebooks)
- https://colab.research.google.com/

c) Anaconda (local installation, includes many additional libraries and Jupyter Notebooks) 
- https://www.anaconda.com/download/

**Python program code can be run:**
- in Jupyter notebooks (text notebooks with executable code "cells")
  - cells can be executed and the result is printed below the code cell
- as standalone programs (.py files)
  - program code in .py files that can be executed all together, without requiring user involvement
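
As an illustration of the standalone option, the sketch below shows what the contents of a very small program file could look like. The file name `hello.py` is hypothetical; saving the code under that name and running `python hello.py` in a terminal would execute all lines from top to bottom.

```python
# hello.py (hypothetical file name)

# Every line of a .py file runs in order when the program is started
name = "world"
greeting = "Hello, " + name + "!"
print(greeting)
```

The same lines also work unchanged inside a notebook cell.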

In this workshop we will use Jupyter notebooks.

## Jupyter Basics <a class="anchor" id="jupyter-basics">

In Jupyter Notebooks you work in cells that may contain Python code or specially formatted text (Markdown).

These shortcuts work both in local Jupyter Notebooks and Google Colab:
    
* Ctrl-Enter runs the current cell in place
* Alt-Enter runs the current cell and creates a new cell below
* Esc-A creates a new cell above current cell
* Esc-B creates a new cell below current cell
    
These shortcuts do not work in Google Colab:
* Esc-M turns cell into Markdown cell for formatting (https://guides.github.com/pdfs/markdown-cheatsheet-online.pdf)
* Esc-Y turns cell into code cell (default)
* Esc-dd deletes current cell


In [None]:
# Try Esc-B to create a new cell or click an icon for inserting a new notebook cell below

# Enter print("Hello Humanities!")
# Press Ctrl-Enter to execute the cell

# Did you get any error messages?

## Notebook Plan

- Values
- Variables and data types
- Getting user input
- Arithmetic operators
- Text strings
- Detour: ChatGPT
- Lists, dictionaries and tuples
- Program flow control
- Functions
- Libraries
- Working with files

## Values


In [None]:
# Integer value (whole number)
42

In [None]:
# The type() function lets us determine the type of the value or a variable
type(42)


In [None]:
# Floating point value (decimal number)
3.14159

In [None]:
type(3.14159)

In [None]:
# Text string value (text)
"Hello world"

In [None]:
type("Hello world")

In [None]:
# Text values can use both single quotes ' and double quotes "
'This is also a text string'

In [None]:
"""
This text value
consists of multiple
lines
"""

In [None]:
print("""
This text value
consists of multiple
lines
""")

In [None]:
# Boolean value = True or False
True

In [None]:
False

In [None]:
type(True)

In [None]:
# Boolean operation "and" returns True only if both its arguments are True
True and False

## Variables <a class="anchor" id="variables">

Variables are like named boxes that can store values

In [None]:
# Creating a variable
# It will be available throughout this notebook once the command is run
myname = "Uldis"

In [None]:
# Print the value of the "myname" variable
print(myname)

In [None]:
# Python commands and variable names are "case sensitive"
#  - for example, "Print" is different from "print"
Print(myName)

In [None]:
y = 2024

In [None]:
theAnswer = 42

In [None]:
myPi = 3.14159

In [None]:
# boolean variable (true/false)
isHot = True

In [None]:
# type(variableName) will return variable data type
type(theAnswer)

In [None]:
# What is the data type of myname ?
# How about data type of isHot ?

In [None]:
type(myPi)

In [None]:
type(isHot)

In [None]:
print(isHot)

In [None]:
# we can change the value stored in a variable
isHot = False

In [None]:
print(isHot)

In [None]:
# Variable names cannot be reserved keywords
help("keywords")

In [None]:
help(print)

### Getting user input

The `input()` function lets the program ask for user input

In [None]:
# get input (as text string)

text = input("Enter a number: ")

In [None]:
text

In [None]:
# the result of the input() command is a text string
# let's convert it to an integer

int(text)

#### Input is a function

* a function "packages" some lines of a Python program together and gives them a name

* a function may be called with one or more arguments (e.g. text to print)
* a function may do something (e.g. print some text)
* a function may return a value (e.g. user input)

Other functions:

* `print()`
* `type()`

Functions are called by their name followed by an opening parenthesis, a list of arguments and a closing parenthesis:

- `print(text)`
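
The difference between a function that does something and a function that returns a value can be seen in a short sketch that uses only built-in functions. `len()` is introduced later in this notebook, and the variable names are chosen just for this illustration.

```python
# len() takes one argument and returns a value that we can store
length = len("Hello")
print(length)

# print() does something (shows text) but returns the special value None
returned = print("Hello")
print(returned)

# type() returns the data type of its argument
kind = type(length)
print(kind)
```

Storing the result in a variable works for every function call, but it is only useful when the function returns something other than `None`.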

#### Your task: ask for a user's name and print a greeting

1) ask a user to input their name
2) print a greeting, addressing the user by name

### Data types in Python 3.x

* Integers
  * type(42)
  * int
* Floating Point (decimal numbers)
  * type(3.14)
  * float
* Boolean
  * type(True),type(False)
  * bool
* String (ordered, immutable sequence of characters)
  * type("OyCaramba")
  * str
* List
  * type([1,2,63,"aha","youcanmixtypeinsidelist", ["even","nest"]])
  * list
* Dictionary (key: value pairs)
  * type({"foo":"bar", "favoriteday":"Friday"})
  * dict
* Tuple (ordered, immutable sequence)
  * type(("sup",7,"dwarves"))
  * tuple
* Set (unordered collection of unique values)
  * type({"p", "o", "t", "a", "t", "o"})
  * set
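
The list above can be checked in one go. The following sketch loops over one example value per type and prints the short type name; the `for` loop and the `append()` method are explained later in this notebook, and the variable names are illustrative.

```python
examples = [
    42,
    3.14,
    True,
    "OyCaramba",
    [1, 2, 63],
    {"foo": "bar"},
    ("sup", 7, "dwarves"),
    {"p", "o", "t", "a", "t", "o"},
]

type_names = []
for value in examples:
    # type(value).__name__ is the short name of the type, e.g. "int"
    type_names.append(type(value).__name__)
    print(type(value).__name__, ":", value)

# A set keeps only unique values, so the letters of "potato" shrink to 4
unique_letters = len({"p", "o", "t", "a", "t", "o"})
print(unique_letters)
```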

## More on variables
https://realpython.com/python-variables

## Arithmetic Operators <a class="anchor" id="arithmetic">

* `+ - * / `
* `**(power)`
* `% modulus`
* `//(integer division)`
* `() parenthesis for order`


In [None]:
5*3 + 4*3 - (6/2)

In [None]:
5/2

In [None]:
5//2 # integer division: gives the whole part of the result


In [None]:
5 % 2 # this gives you remainder / technically called modulo

In [None]:
4%3

In [None]:
type(1)

In [None]:
type(14.0)

In [None]:
5**33 # 5 to 33rd power

In [None]:
# Python integers have no maximum size
11**120

In [None]:
# Googol
10**100

In [None]:
big_num = 10**100
big_num

In [None]:
string_num = str(big_num) # we can convert anything to a string
string_num

In [None]:
string_num.count("0")

## More on operators
https://www.w3schools.com/python/python_operators.asp

## Strings
* immutable
* Unicode support


* implement all common sequence operators
https://docs.python.org/3/library/stdtypes.html#typesseq-common

* string specific methods
https://docs.python.org/3/library/stdtypes.html#string-methods
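
A few of the common sequence operations applied to a string, as a quick sketch. The variable names are illustrative; every operation shown is listed in the sequence documentation linked above.

```python
word = "potatoes"

has_tat = "tat" in word          # membership test
shouted = word + "!"             # concatenation
laugh = "ha" * 3                 # repetition
length = len(word)               # number of characters
first_t = word.index("t")        # index of the first "t" (counting from 0)
o_count = word.count("o")        # how many times "o" appears
smallest = min(word)             # character with the lowest code point
largest = max(word)              # character with the highest code point

print(has_tat, shouted, laugh, length, first_t, o_count, smallest, largest)
```

The same operations work on lists and tuples, which is why strings are described as sequences.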

In [None]:
print(myname)

In [None]:
# String length
len(myname)

In [None]:
# How is this different from "myname" ?
len("myname")

In [None]:
# Getting individual characters (counting from 0)
myname[1]

In [None]:
# Getting the last character
myname[-1]

### String Slicing

In [None]:
# Slicing lets us get parts (slices) of text values

# Slicing syntax
# Start at 0 and go up to, but not including, 3
myname[0:3]

In [None]:
# We can omit the default start value (0) 
myname[:3]

In [None]:
myname[1:3]

In [None]:
# last three characters
myname[-3:] # so we slice from 3 characters in the end and go towards that end

In [None]:
myname[-3:-1] # this will give 3rd and 2nd character from the end BUT not the last one

In [None]:
print(myname[4])  # Python offers two indexes one from the start starting at 0, and one from the end starting at -1
print(myname[-1])

In [None]:
myname[9000]

In [None]:
# slicing syntax actually has a 3rd optional modifier - step
myname[0:6:2]

In [None]:
myname[::2] # so I want to print all letters starting from start but skip every 2nd one

In [None]:
# lets play with food!
food = "potatoes"
food[::2]

In [None]:
food[1::2] # we start with a 2nd letter and skip every 2nd letter from then on

In [None]:
food[1:6:2]

In [None]:
# Pythonic way of reversing a string
food[::-1]


In [None]:
food[2] = "x"    # this would not work

In [None]:
# modifying strings
# strings are immutable, so food[2] = "x" is not allowed
# instead we build a new string: the first 2 letters of food,
# then the letter "x", then the rest of food starting with the 4th character
newfood = food[:2] + "x" + food[3:]
newfood

In [None]:
last_name = "Bojārs"
last_name

In [None]:
full_name = myname + last_name
full_name

In [None]:
full_name = myname + " " + last_name
full_name

## "f-strings", “formatted string literals”

In some other languages also known as string interpolation
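
Besides plain variable names, the braces of an f-string can hold any expression, and a format specifier after a colon controls how the value is shown. A small sketch with made-up values:

```python
item = "coffee"
price = 2.5
quantity = 3

# Expressions inside the braces are evaluated before being inserted
summary = f"{quantity} x {item} costs {price * quantity}"
print(summary)

# A format specifier after ":" controls presentation, here 2 decimal places
formatted = f"Total: {price * quantity:.2f} EUR"
print(formatted)
```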

In [None]:
# Create myname and favfood variables with the appropriate text
# Then run the print function below

myname

In [None]:
favfood = "pizza"
favfood

In [None]:
print(f"My name is {myname} and my favorite food is {favfood} ")

# f-strings were added in Python 3.6 (older formatting methods are not covered in this course)
# https://realpython.com/python-f-strings/


In [None]:
print("My name is {myname} and my favorite food is {favfood} ") # just a regular string and no variables will be replaced

In [None]:
# Old string concatenation method
print("My name is " + myname + " and my favorite food is " + favfood)

In [None]:
food_string = f"My name is {myname} and my favorite food is {favfood}"
food_string

## Detour: ChatGPT

**ChatGPT is a powerful system that can generate program code and human-like text, and can participate in conversations with the user.**

It is based on a large language model that predicts the next tokens (words or parts of words) based on the existing text.

**Get a free ChatGPT account here:**
 - https://chatgpt.com/

**How you can use ChatGPT in programming:**
 - help with program error messages (e.g. Python exceptions)
 - explaining program code
 - writing programs
 - ...

**Limitations of ChatGPT:**
 - hallucinations (e.g. ChatGPT and similar services can "invent" Python libraries that do not exist)
 - bias (the model reflects biases in the data that it was trained on)
 - knowledge cut-off (the model does not have knowledge about recent events, ...)
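
One way to guard against hallucinated libraries is to check whether a suggested module can actually be found before relying on generated code. Below is a minimal sketch using `importlib.util.find_spec` from the standard library; the module name `surely_not_a_real_library_xyz` is made up for illustration. The check only covers your own Python environment: a module that is not found may still exist and simply not be installed.

```python
import importlib.util

# find_spec() returns None when a top-level module cannot be found
math_found = importlib.util.find_spec("math") is not None
fake_found = importlib.util.find_spec("surely_not_a_real_library_xyz") is not None

print("math available:", math_found)
print("made-up library available:", fake_found)
```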

---

In [None]:
# 1) Use ChatGPT to get help with the error message from the print error example
#    at the beginning of this notebook


In [None]:
# 2) Use ChatGPT to explain Python program code that you need help with:
#    e.g. the code for using Python f-strings


In [None]:
# 3) Ask ChatGPT to write a simple program


---

## String Methods

String methods let us do something with text string values (e.g. transform them).

In [None]:
myname

In [None]:
help(str)

In [None]:
help(str.capitalize)

In [None]:
some_string = "aBBa"
some_string

In [None]:
some_string.capitalize()

In [None]:
some_string.upper()

In [None]:
some_string.count("B")

In [None]:
help(str.count)

In [None]:
"BBBB".count("BB") # the count method will not consider overlapping matches

In [None]:
some_string.endswith("Ba")

In [None]:
some_string.startswith("aB")

In [None]:
# we can use the result of a method call (in this case: True) in the "if" command 
# to execute some code when the result is True

if some_string.endswith("Ba"):
    print("This string ends with 'Ba'")

In [None]:
some_string.find("B") # returns index of first occurrence

In [None]:
myname.find("dis")

In [None]:
myname[2:]

In [None]:
myname.lower() # everything to lowercase

In [None]:
myname.upper()

In [None]:
myname.replace("U", "Va")

In [None]:
myname # strings are immutable (cannot be changed "in place")! Here's what we can do:

In [None]:
new_name = myname.replace("U", "Va")
new_name

In [None]:
myname

In [None]:
# we can overwrite the old variable (myname)
myname = myname.replace("U", "Va")
myname

In [None]:
"quick brown fox".title()

In [None]:
"quick brown fox".capitalize()

In [None]:
sentence = "A quick brown fox jumped over       a sleeping dog"
sentence

In [None]:
"fox" in sentence

In [None]:
# we can get a list of words by splitting the string by any amount of whitespace including newlines, tabs, etc.
words = sentence.split() 

words

In [None]:
# join a list of words using a single whitespace as the joining element
joined_string = " ".join(words)
joined_string

In [None]:
joined_string = " ||| ".join(words) # you can join by multiple characters
joined_string

In [None]:
smiley_string = " 😀 ".join(words) # can use any Unicode characters
smiley_string

## Python Lists

* Ordered
* Mutable (can change individual members!)
* Comma-separated values between square brackets: [1,3,2,5,6,2]
* Can have duplicates
* Can be nested


In [None]:
# the range command creates a sequence of integers from the start value to the end value (not included)
range(11,21,1)

In [None]:
# we can convert the range of integers to a list
mylist = list(range(11,21,1))

# mylist = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20] would work too but not practical for longer ranges...

print(mylist)

In [None]:
shopping_list = ["Chocolate", "Milk", "Cookies"] # we created a list of 3 strings - can be changed later on as needed
shopping_list

In [None]:
# get an individual element of a list
# (counting from 0)
shopping_list[1]

In [None]:
# use negative numbers to count elements from the end of the list
# (e.g. -1 for the last element)
shopping_list[-1]

In [None]:
# you can change the value of list elements, too
shopping_list[-1] = "Potatoes"
shopping_list

### Slice notation

somestring[start:end:step]

somelist[start:end:step]

start defaults to index 0 (the first element); end is exclusive, so the slice stops at index end - 1
#### Examples below

In [None]:
mylist

In [None]:
mylist[3:] # starting from the 4th element(with index 3)

In [None]:
mylist[:-2] # everything BUT the last two items

In [None]:
mylist[3:6] # from element with index 3 to element with index 6 (not included)

In [None]:
# step 2 = every other element
mylist[::2]

In [None]:
# negative step value returns a list in a reverse order
mylist[::-1]

In [None]:
# slicing also works on string values
myname[::-1]


### Common list methods.
* list.append(elem) -- adds a single element to the end of the list. Common error: does not return the new list, just modifies the original.
* list.insert(index, elem) -- inserts the element at the given index, shifting elements to the right.
* list.extend(list2) -- adds the elements in list2 to the end of the list. Using + or += on a list is similar to using extend().
* list.index(elem) -- searches for the given element from the start of the list and returns its index. Throws a ValueError if the element does not appear (use "in" to check without a ValueError).
* list.remove(elem) -- searches for the first instance of the given element and removes it (throws ValueError if not present)
* list.sort() -- sorts the list in place (does not return it). (The sorted() function shown later is preferred.)
* list.reverse() -- reverses the list in place (does not return it)
* list.pop(index) -- removes and returns the element at the given index. Returns the rightmost element if index is omitted (roughly the opposite of append()).
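
The methods above can be followed step by step in this sketch. The list `numbers` and the other variable names are illustrative; the comment on each line shows the list after that step.

```python
numbers = [3, 1, 2]

numbers.append(5)            # [3, 1, 2, 5]
numbers.insert(0, 9)         # [9, 3, 1, 2, 5]
numbers.extend([7, 8])       # [9, 3, 1, 2, 5, 7, 8]
position = numbers.index(2)  # the value 2 sits at index 3
numbers.remove(9)            # [3, 1, 2, 5, 7, 8]
numbers.reverse()            # [8, 7, 5, 2, 1, 3]
last = numbers.pop()         # returns 3, list is now [8, 7, 5, 2, 1]
numbers.sort()               # [1, 2, 5, 7, 8]

# The common error: append() changes the list and returns None,
# so assigning its result does not give you the new list
returned = numbers.append(10)

print(numbers, position, last, returned)
```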

In [None]:
shopping_list

In [None]:
shopping_list.append("Kefir")
shopping_list

In [None]:
shopping_list.sort() # this will sort a list
shopping_list

In [None]:
shopping_list.append("fork")
shopping_list

In [None]:
shopping_list = shopping_list[:-3] # remove last 3 items
shopping_list

In [None]:
shopping_list.remove("Chocolate")
shopping_list

In [None]:
"Kefir" in shopping_list  # this needs to be exact match

In [None]:
"kefir" in shopping_list # case sensitive so it should be false

In [None]:
"Milk" in shopping_list

In [None]:
# The "in" operator also works on text strings
"dis" in myname

In [None]:
myname = "Uldis"

In [None]:
"al" in myname # Uldis does not have the "al" substring

#### Your task: let the user enter some text and output the number of words in this text

*Task 2 (optional): Let the user enter 2 numbers separated by a space and print out a sum of these numbers*


## Dictionaries

* Collection of Key - Value pairs
* also known as associative array
* keeps insertion order since Python 3.7 (values are still accessed by key, not by position)
* keys unique in one dictionary
* storing, extracting

In [None]:
# A dictionary is a key-value store, also known as a hash map
# Keys must be unique
mydict = {"country": "Latvia",
          "city": "Riga"} 
mydict

In [None]:
mydict["city"]

In [None]:
# adding and overwriting key value pair
mydict["food"]="potatoes" # if key "food" does not exist it will create this pairing, if it exists it will overwrite
mydict

In [None]:
mydict["food"] # can access values by key very very quickly

In [None]:
mydict["country"]

In [None]:
mydict.keys()

In [None]:
mydict.values()

In [None]:
"potatoes" in mydict.values() # this will be slower in larger dictionaries

In [None]:
"country" in mydict # same as "country" in mydict.keys(), this is quick no matter the dictionary size

In [None]:
country_dict = {"countries" : [{"country":"Latvia", "food":"Potatoes"},{"country":"Estonia"}], "cities":["Riga","Tallinn","Kyiv"]}
country_dict

In [None]:
# let's import a pprint library for better print formatting
from pprint import pprint

pprint(country_dict)

In [None]:
country_dict["countries"] # we get a list of dictionaries

In [None]:
country_dict["countries"][0] # the inner dictionary with info about Latvia is first in the list

In [None]:
country_dict["countries"][0]["food"]

## Tuples

* ordered
* immutable (cannot be changed!)
  * otherwise similar to lists

In [None]:
mytuple = (6, 4, 9)
print(mytuple)

In [None]:
mytuple2 = 4, 9, 16  # tuples can be defined without parentheses, too
print(mytuple2)

In [None]:
mytuple[2] # can access tuple elements by index

In [None]:
mytuple[2] = 8  # changing an element of a tuple will not work

Tuples can be used as dictionary keys or as a collection of values (e.g. for returning multiple values from a function).

*We will look at defining functions later on.*
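
A short sketch of the dictionary-key use: because tuples are immutable, a pair of values can serve as a single key. The coordinates below are rounded, illustrative values and the variable names are made up. The last lines show tuple unpacking, an addition not covered elsewhere in this notebook, which assigns each element of a tuple to its own variable.

```python
# Tuples of (latitude, longitude) used as dictionary keys
city_by_location = {
    (56.95, 24.11): "Riga",
    (59.44, 24.75): "Tallinn",
}

# Look up a value using a tuple as the key
city = city_by_location[(56.95, 24.11)]
print(city)

# Unpacking: each element of the tuple goes into its own variable
latitude, longitude = (59.44, 24.75)
print(latitude, longitude)
```

A list could not be used as a key here, because lists can change after they are created.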

In [None]:
def return_multiple():
    return 1, 4, 9

result = return_multiple()

In [None]:
print(result)

print()
print("Accessing the 2nd element (index 1) of the returned tuple:")
print(result[1])

## Flow Control <a class="anchor" id="flow-control">


In [None]:
# With Flow Control we can tell our program/cells to choose different paths
# or to run some commands repeatedly

## Conditional operators

`< > <= >= == != and or not`

In [None]:
# What is truth in computer language?

In [None]:
2*2 == 4

In [None]:
5 > 7


In [None]:
print(5 == int('5'))

print(5 <= 6)

In [None]:
print(5 <= 5)

# check if 5 is NOT equal to 6
print(5 != 6)

print(5 != 5)


In [None]:
# Strings are compared character by character from the left
# on the first mismatch the Unicode code points of the characters decide (see ord() below)
# this is called lexicographical ordering
'VALDIS' < 'VOLDEMARS'

In [None]:
ord("V")

In [None]:
ord("A")

In [None]:
ord("O")

In [None]:
"UL" in "ULDIS"

In [None]:
"Ul" in "ULDIS"

In [None]:
"text" in ["This", "is", "text"]

In [None]:
# combining multiple boolean values

True and True

In [None]:
True or False

In [None]:
not True

## If Statement

In [None]:
## Conditional execution lets the program choose 
## which commands to execute (e.g. based on the value of a comparison operation)

# if 4 is larger than 5 then do something
if 4 > 5:
    print("4 is larger than 5, wow!")
    print("Another command inside the if statement")

# now I am out of the "if" command
print("Always prints")

In [None]:
if 5 >= 5:
    print("hello")

if 5 == 6:
    print("hello that's magic")

if 5 != 6:
    print("hello that's not magic")

In [None]:
# What's wrong with this code?
print('hello that's magic')

In [None]:
"Valdis" != "Uldis" # we can check for inequality

In [None]:
# same as
not "Valdis" == "Uldis"

In [None]:
# the indentation in front of the code below the "if" command
# means that it is a code block to be executed when the boolean
# value is True

if 2*2 == 4:
    print("Do one thing if the condition is True")
    print("Do more things if the condition is True")

    # we can keep adding more things here
    print("Also do this when the condition is True")

print("Do this always")

In [None]:
a = 4

# if and else here are part of the same if / else statement
if a > 5:
    print('a is larger than 5')
    # do more stuff
else:
    print('a is NOT larger than 5')
    # do more stuff if a is not larger than 5

In [None]:
# elif comes from "else if"

x = int(input("Enter an integer please! "))

if x > 42:
    print("Too ambitious an answer!")
elif x < 42:
    print("You dream too little!")
else: # only 42 remains here
    print("That is the answer to everything!")

#These lines below will execute always
print('Your number is', x)

In [None]:
# converting a string to a decimal number using float()
c = float(input("Enter temperature in Celsius "))

f = c * 9/5 + 32

print("Fahrenheit Temperature is", f, round(f, 2)) # round(f, 2) rounds to 2 decimal places; change 2 to the precision you need

if f > 100:
    print("You are too hot, find a doctor?")

In [None]:
# Try reversing the above program to create a Fahrenheit to Celsius converter

## Loops

In [None]:
# How would we perform the same/similar action multiple times?

In [None]:
# A "while" loop instructs the computer to repeat a code block
# while the loop condition (e.g. i < 5) is True

i = 0
print("Alice did")

while i < 5:
    print("talk")
    print("  i is ", i)
    i += 1   # same as i = i + 1

In [None]:
i

In [None]:
# What would happen if we did not have i+=1 in our above program ?

In [None]:
# A "for" loop instructs the computer to repeat a code block
# for every element in the given sequence (e.g. a list of numbers)

for x in range(10): # range is a number "factory"
    print("Running the cycle.")
    print(x)

In [None]:
# we can loop / iterate through strings, too
for c in "Uldis":
    print(c)

#### Iterating (looping) through lists

In [None]:
# we can iterate through list elements, too
food = ["apples", "carrots", "oranges"]

# for every element in the "food" list
for item in food:
    print(item)

In [None]:
shopping_list.append("Chocolate")
shopping_list.append("Strawberries")
                     
for need_to_buy in shopping_list:
    print("Need to buy", need_to_buy)
    # here we could actually do the buying operation

In [None]:
mylist = list(range(11,21))

for item in mylist:
    print("Item", item)

In [None]:
mydict = {"country": "Latvia", "capital": "Riga"}

mydict

In [None]:
# loop through a dictionary
for key, value in mydict.items(): # you could call them key, value or anything else
    print(key, ":", value)

### Your turn!

In [None]:
# Write a program that counts the number of different
# words in a text string and displays the top 10 words
# and the number of times they appear in the text.

text = """
The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English written in the Python programming language. It supports classification, tokenization, stemming, tagging, parsing, and semantic reasoning functionalities.[4] It was developed by Steven Bird and Edward Loper in the Department of Computer and Information Science at the University of Pennsylvania.[5] NLTK includes graphical demonstrations and sample data. It is accompanied by a book that explains the underlying concepts behind the language processing tasks supported by the toolkit,[6] plus a cookbook.[7]
NLTK is intended to support research and teaching in NLP or closely related areas, including empirical linguistics, cognitive science, artificial intelligence, information retrieval, and machine learning.[8] NLTK has been used successfully as a teaching tool, as an individual study tool, and as a platform for prototyping and building research systems. There are 32 universities in the US and 25 countries using NLTK in their courses.
"""

# 1) split the text into words

# write code here

# 2) use a dictionary to count the number of unique words

results = {}  # empty dictionary

# write code here

# 3) display top 10 results

# write code here

## What is a function? <a class="anchor" id="functions">

* A function is a block of organized, reusable code that is used to perform a single, related action.
### DRY - Don't Repeat Yourself principle
* Every piece of knowledge must have a single, unambiguous, authoritative representation within a system. http://wiki.c2.com/?DontRepeatYourself

* Contrast WET - We Enjoy Typing, Write Everything Twice, Waste Everyone's Time
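
The contrast can be sketched with the Celsius to Fahrenheit formula used earlier in this notebook. The temperatures and the function name `celsius_to_fahrenheit` are made up for illustration.

```python
# WET: the same formula is typed out for every value
riga_wet = 21 * 9 / 5 + 32
tallinn_wet = 18 * 9 / 5 + 32

# DRY: the formula lives in one place and is reused
def celsius_to_fahrenheit(celsius):
    '''Returns the temperature converted from Celsius to Fahrenheit'''
    return celsius * 9 / 5 + 32

riga_dry = celsius_to_fahrenheit(21)
tallinn_dry = celsius_to_fahrenheit(18)

print(riga_wet, riga_dry)
print(tallinn_wet, tallinn_dry)
```

If the formula ever needs a correction, the DRY version has exactly one line to change.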

In [None]:
# Here we define our first function
# it does something (prints text)

def myFirstFunc():
    print("Running my first function")
    print("Do more stuff")

In [None]:
# function has to be defined before it is called
myFirstFunc()

In [None]:
# we can call it repeatedly

myFirstFunc()
myFirstFunc()
myFirstFunc()

In [None]:
# Passing parameters (arguments)

def printName(name):
    print(f"Maybe my name is: {name}")


In [None]:
printName("Uldis")

In [None]:
printName("Sergii")

In [None]:
course_name = "Introduction to Python"

# you can also use variables as function arguments
printName(course_name)

In [None]:
def alternate_letters(text):

    new_string = ""
    for i, c in enumerate(text):
        if i % 2 == 0:
            new_string += c.lower() # this is the same as saying new_string = new_string + c.lower()
        else:
            new_string += c.upper()  # this += is fine for smaller strings

    # print(new_string)

    return new_string # with return I can use the results of this function not just output to screen

In [None]:
alternate_letters("Coffee")

In [None]:
my_result = alternate_letters("Kefir")

In [None]:
my_result

In [None]:
# It is good practice to describe what the function does
# We can make Docstrings with '''Helpful function description inside'''

def mult(a, b):
    '''Returns the product of the two arguments'''

    print("Look ma I am multiplying!", a, b, a*b)
    return a * b

In [None]:
help(mult)

In [None]:
mult(5, 3)

In [None]:
def printnum(num):
    if num > 10:
        print(f"This number {num} is too unwieldy for me to print")
    else:
        print(f"This {num} is a nice number")

printnum(8)
printnum(11)

In [None]:
# Convert your program that counts the number of different
# words into a function that takes the input text argument
# and prints the top 10 words and the number of times
# they appear in the text.

In [None]:
def top_10_words(input_text):

    # clean the text

    # split text into tokens

    results = {}  # empty dictionary

    # count the word frequency
    
    # print the results


In [None]:
text = """
The quick brown fox jumps over the lazy dog!
Let's repeat some words like fox lazy, more words: brown fox
"""

top_10_words(text)

## Libraries <a class="anchor" id="libraries">


In [None]:
# Python and Batteries Included Philosophy
## Why reinvent the wheel?

In [None]:
import math


In [None]:
# notice the . syntax helper
math.cos(3.14)

In [None]:
math.pi

In [None]:
math.cos(math.pi)

In [None]:
import random   # random number generator library
# https://docs.python.org/3/library/random.html


In [None]:
# generate a random integer from 1 to 6 (both included)

result = random.randint(1,6)
print(result)

In [None]:
# print 10 random numbers

for i in range(10):
    print(random.randint(1,6))


In [None]:
help(random.randint)

In [None]:
from collections import Counter

In [None]:
magic = "abracadabra"

cnt = Counter(magic)

cnt

In [None]:
cnt.most_common(5)

In [None]:
for key, value in cnt.most_common(5):
    print(key, ":", value)

In [None]:
cnt = Counter(text.lower().split())

In [None]:
cnt.most_common(5)

In [None]:
# There are thousands of useful Python libraries
## Crucial libraries are collected in the Python Standard Library

# https://docs.python.org/3/library/

# "Batteries included"

# There are many more libraries that can be installed separately. Many of them
# are already installed in the Google Colab or Anaconda Python environment.

# If you need to use a library that is not yet installed, you can 
# install it using the "pip install" command (in this case, we are
# installing the "requests" library):

#!pip install requests

# https://requests.readthedocs.io/en/latest/

#### Your task - write a program for a number guessing game

* generate a random number 1..100 for a user to guess
* let a user input their guesses and display relevant messages:
  * too low
  * you guessed right!
  * too high
  
Limit the number of times the user may guess to 6.

In [None]:
# 1) Generate a random number

...

# 2) Cycle for 6 times

for i in range(6):

    # ask user for input
    ...
    
    # print a message depending on the value entered by user
        
    # if guessed right, stop the cycle (use the "break" command)


## Working with files
   
* reading files
* writing files
* folders

In [None]:
from pathlib import Path

### Using a local computer filesystem

In [None]:
# listing contents of a folder

# . is the current folder (e.g. the folder you ran Jupyter notebooks from)
folder = "."
my_path = Path(folder)

list(my_path.iterdir())

In [None]:
# let's use "sorted" to get a sorted list
for item in sorted(my_path.iterdir()):
    print(item)

### Google Colab note

In Google Colab we cannot directly access local files.

Use Google Drive instead:
* mount Google Drive
* upload and access files on [Google Drive](https://drive.google.com/)
  * e.g. in the "BSSDH" subfolder

Colab will ask for permission to access your files on Google Drive in order to work with the files you have uploaded.

*After you are finished working in Google Colab you may want to remove its permission to access Google Drive files.*
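
A notebook can also try to detect Colab on its own. The sketch below relies on the common idiom of checking whether the `google.colab` module has been loaded; this is an assumption about how Colab currently behaves rather than a documented guarantee, and the variable name `running_in_colab` is hypothetical.

```python
import sys

# Colab loads the google.colab module at start-up, so finding it in
# sys.modules suggests the notebook is running there (assumed behaviour)
running_in_colab = "google.colab" in sys.modules

if running_in_colab:
    print("Running in Google Colab: mount Google Drive to work with files")
else:
    print("Not running in Colab: local files can be used directly")
```

Setting the flag by hand remains the more predictable option if the detection ever gives the wrong answer.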

In [None]:
# Set this to False if not using Google Colab
using_colab = True

if using_colab:
    # using Colab
    
    from google.colab import drive
    drive.mount('/content/drive')
    
    # "BSSDH" subfolder inside Google Drive
    drive_path = Path('/content/drive/MyDrive/BSSDH')

    # create the folder if it does not exist
    drive_path.mkdir(exist_ok=True)

    for item in drive_path.iterdir():
        print(item)

    # set my_path to the Google Drive path
    # (instead of a local directory)
    my_path = drive_path

else: 
    # not using Colab

    my_path = Path(".")

### Reading and writing files

In [None]:
# https://www.gutenberg.org/cache/epub/11/pg11.txt

file_path = my_path / "alice_wonderland.txt"
file_path



In [None]:
# reading a file

# we use the "with" statement + "open" function to open a file for reading
# and assign this open file to the "input_file" variable

with open(file_path, encoding="utf8") as input_file:

    # do something with the file
    
    # "text" will have the full contents of the input file
    text = input_file.read()

# we need to close open files when we have finished working with them.
# when using the "with" statement the file will be closed automatically.

print(text[:200])

In [None]:
# writing a file

my_text = """Text to be written
to a file. Let's write
multiple lines
"""

# "w" means open a file for writing
# it is useful to specify character encoding, too
with open(my_path / "new_file.txt", "w", encoding="utf8") as output_file:
    output_file.write(my_text)

sorted(my_path.iterdir())

In [None]:
# let's filter a file to remove the Project Gutenberg header
# and footer before analyzing its text

with open(file_path, encoding="utf8") as input_file:
    with open(my_path / "output.txt", "w", encoding="utf8") as output_file:

        # this will signal Python if the current line needs
        # to be written to the output file
        must_write = False

        # loop through every line in the input file
        for line in input_file:

            if line.startswith("*** END"):
                must_write = False   # footer reached: stop writing

            if must_write:
                output_file.write(line)

            if line.startswith("*** START"):
                must_write = True

In [None]:
file_path

In [None]:
with open(my_path / "output.txt", encoding="utf8") as input_file:

    # "text" will have the full contents of the input file
    text = input_file.read()

print(text[:190])

#### Your task - count the word frequency in a file


* choose or download the file to analyze
* put the file in a folder you can access from Python
  * e.g. current folder if using a local Python installation
  * Google Drive folder if using Google Colab
* read the file
* count and display word frequency for the most frequent words

(Optional) write word frequency data to a new file
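
A minimal sketch of the counting step, assuming the text has already been read into a string. The short `sample_text` below stands in for the contents of your file, and splitting on whitespace is a simplification (punctuation stays attached to words).

```python
from collections import Counter

# a short sample string stands in for the text read from your file
sample_text = "the cat and the dog and the bird"

# lowercase the text and split it into words on whitespace
words = sample_text.lower().split()

# Counter counts how many times each word occurs
word_counts = Counter(words)

# most_common(n) returns the n most frequent (word, count) pairs
for word, count in word_counts.most_common(2):
    print(word, count)
```

For a real book you will probably want to strip punctuation from the words before counting them.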

In [None]:
# Hint: use the Counter class from the "collections" module

...


## Most important Python ideas <a class="anchor" id="python-ideas">

* `dir(myobject)` lists what can be done with an object, i.e. its attributes and methods (most decent text editors/IDEs will offer autocompletion and hints too)
* `help(myobject)` shows general help for the object
* `type(myobject)` shows what type it is
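
A small sketch of these three functions applied to a string; the variable name `name` is just an example.

```python
name = "Alice"

# type() tells us what kind of object we have
print(type(name))          # <class 'str'>

# dir() returns a list of attribute and method names
methods = dir(name)
print("upper" in methods)  # True

# help() prints the documentation, e.g. help(str.upper)
# here we just call the method we found
print(name.upper())        # ALICE
```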

### Slicing syntax for sequences (strings, lists and more)
```
myname[start:end:step]
myname[:5]
```
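
A few worked slices, as a sketch. The string and list values are arbitrary examples; `start` is included, `end` is excluded, and a negative `step` walks backwards.

```python
myname = "Monty Python"

print(myname[:5])      # first five characters: Monty
print(myname[6:])      # from index 6 to the end: Python
print(myname[-6:])     # last six characters: Python
print(myname[::2])     # every second character: MnyPto
print(myname[::-1])    # reversed: nohtyP ytnoM

# slicing works the same way on lists
numbers = [0, 1, 2, 3, 4, 5, 6]
print(numbers[1:6:2])  # [1, 3, 5]
```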

### : indicates a new indentation level

```
if x > 5:
    print("Do Work when x > 5")
print("Always Do this")
```
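
To see which lines belong to the `if` block, here is a runnable sketch of the same idea. The function name `describe` is made up for illustration; it collects the messages instead of printing them directly so that the result is easy to inspect.

```python
def describe(x):
    messages = []
    if x > 5:
        # indented under "if": runs only when x > 5
        messages.append("Do work when x > 5")
    # back at the outer indentation level: always runs
    messages.append("Always do this")
    return messages


print(describe(10))  # both messages
print(describe(1))   # only the second message
```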

# Python Resources <a class="anchor" id="learning-resources">


## Wiki for Tutorials

https://wiki.python.org/moin/BeginnersGuide/NonProgrammers

## Tutorials Beginner to Intermediate




* https://automatetheboringstuff.com/ - Anything by Al Sweigart is great
* http://newcoder.io/tutorials/ - 5 sets of practical tutorials
* [Think Like a Computer Scientist](http://interactivepython.org/runestone/static/thinkcspy/index.html) full tutorial
* [Non-Programmers Tutorial for Python 3](https://en.wikibooks.org/wiki/Non-Programmer%27s_Tutorial_for_Python_3) quite good for wikibooks
* [Real Python](https://realpython.com/) Python Tutorials for all levels


* [Learn Python 3 the Hard Way](https://learnpythonthehardway.org/python3/intro.html) controversial author but very exhaustive, some like this approach

## More Advanced Python-Specific Books

* [Python Cookbook](https://www.amazon.com/Python-Cookbook-Third-David-Beazley/dp/1449340377) Recipes for specific situations

* [Effective Python](https://effectivepython.com/) best practices
* [Fluent Python](http://shop.oreilly.com/product/0636920032519.do) **highly recommended**, shows Python's advantages

## General Best Practices Books
#### (not Python specific)

* [Code Complete 2](https://www.goodreads.com/book/show/4845.Code_Complete) - Fantastic best practices
* [The Mythical Man-Month](https://en.wikipedia.org/wiki/The_Mythical_Man-Month) - No silver bullet even after 40 years.
* [The Pragmatic Programmer](https://www.amazon.com/Pragmatic-Programmer-Journeyman-Master/dp/020161622X) - More practical advice
* [Clean Code](https://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882) - more towards agile

## Blogs / Personalities / Forums

* [Dan Bader](https://dbader.org/)
* [Reddit Python](https://www.reddit.com/r/python)

## Exercises/Challenges
* http://www.pythonchallenge.com/ - first one is easy but after that...
* [Advent of Code](https://adventofcode.com/) - yearly programming challenges
* https://projecteuler.net/ - gets very mathematical but first problems are great for testing

## Explore Public Notebooks on Github
 Download them and try them out for yourself

https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks

## Questions / Suggestions ?

Pull requests welcome

e-mail **uldis.bojars at gmail.com**