# IT Skills for linguists 2
## UAM, Faculty of English, 2BA
### Topic: *Subroutines and modules*
#### Poznań, 19.12.2022
#### Teacher: mgr inż. Michał Junczyk


# Introduction
- Problem: As our programs get larger...
  - Managing the code structure becomes harder
    - *"Indentation hell"*
    - Difficult to split work in the project
  - Repeated code in multiple places
    - Changes become harder (syncing)
- Solution: *Factoring*
  - **Factoring** - breaking code into more efficient and conceptually more reasonable chunks.
  - Achieved with **functions** and **modules**. 

# Agenda

- Simple functions
- Functions that return values
- Functions that take arguments 
- Recursive and lambda functions
- Modules
- Docstrings and comments

# 5.1 Simple functions
- Defined with:
  - **def** keyword
  - function name
  - parentheses
  - colon


In [41]:
def myfunction():
    print('This is a function')
    print("That's all it does")

In [42]:
myfunction()
#invoking the function

This is a function
That's all it does


## Repeated code example

In [None]:
#print a famous sentence over two lines
print('Colorless green ideas...')
print('...sleep furiously')
#get the user to enter a number
num = input('Enter a number: ')
#print the sentence again if it's < 5
if int(num) < 5:
    print('Colorless green ideas...')
    print('...sleep furiously')
else:
    print('Your number was big enough')

Colorless green ideas...
...sleep furiously


In [None]:
#a function to print that sentence
def myfunc():
    print('Colorless green ideas...')
    print('...sleep furiously')

#invoke the function
myfunc()

#collect the number
num = input('Enter a number: ')
#check if the number is < 5
if int(num) < 5:
    myfunc() #print sentence again if so
else:
    print('Your number was big enough')

- the repeated section is factored out as a separate function.
- the function is called twice in the code above.

In [None]:
def myfunc():
    #a function
    #user supplies a word
    word = input('Word: ')
    #print that word
    print('This is your word:',word)
    if len(word) > 5: #check if > 5
        print('Your word was long.')
    else:
        print('Your word was short.')

myfunc()

In [None]:
#this doesn't work!
def myfunc():
    word = input('Word: ')
    print('This is your word:',word)

myfunc()

if len(word) < 5:
    print("Your word wasn't long enough")

- Variable *word* is available only **inside the function**
- Referring to a *local* variable outside its function raises a *NameError*
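
A minimal sketch of the scoping rule above, with a fixed string in place of *input()* so that it runs unattended. The names *make_word* and *outcome* are made up for this illustration.

```python
def make_word():
    # 'word' exists only while make_word() is running
    word = 'linguistics'
    print('Inside the function:', word)

make_word()

try:
    # 'word' was never defined outside the function
    print(word)
    outcome = 'no error'
except NameError as err:
    outcome = 'NameError'
    print('Outside the function:', err)
```

Catching the error with *try/except* is only there to keep the cell running; without it the program would stop at the failing line.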

In [None]:
word = input('Word: ') #user supplies word

#function refers to previous value!
def myfunc():
    print('This is your word:',word)

myfunc()

#check if word is less than 5 letters
if len(word) < 5:
    print("Your word wasn't long enough")


- value of *word* is set outside the function.
- function has access to the *global variable* word
- bad idea - referring to external (global) variables in the function makes it depend on code outside it
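
One way to avoid the global variable, sketched under the assumption that the word is handed to the function as an argument (arguments are covered in section 5.3). The function names *show_word* and *is_short* are hypothetical, and a fixed string stands in for the user's input.

```python
def show_word(word):
    # the function only uses the value it is given
    return 'This is your word: ' + word

def is_short(word, limit=5):
    # True if the word has fewer than 'limit' letters
    return len(word) < limit

w = 'hat'  # stands in for input('Word: ')
print(show_word(w))
if is_short(w):
    print("Your word wasn't long enough")
```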

# 5.2 Functions That Return Values

- syntax - the keyword *return* precedes the returned value


In [None]:
def myfunc():
    #function definition
    print("This prints.") #prints this
    return 6
    #return the value 6
    #gratuitous print command
    print("This doesn't print!")

#invoke function, assign value to x
x = myfunc()
#print value of x
print("Here's the function output:", x)

- string *"This prints."* is not the value returned by *myfunc()*
- The value returned appears after the return statement: *6*
- first *print()* statement executes, but the second does not
- Once *return* executes, the function exits; later statements in it do not run

In [None]:
def sillyfunc():
    #function definition
    #user supplies a word
    wd = input('Type a word: ')
    
    if len(wd) > 4: #check length of word
        #return length and exit function
        return len(wd)
    else:
        #otherwise...
        print('The word is too short!')

#save value of function
res = sillyfunc()

#print value of variable res
print('The result: ',res)

res = sillyfunc()
print('The result: ',res)
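
What *sillyfunc()* hands back when the word is too short can be seen in a small sketch. Assumptions: the word is passed in as an argument (see 5.3) instead of being typed by the user, and *length_if_long* is a made-up name. A function that finishes without reaching a *return* statement gives back the special value *None*.

```python
def length_if_long(wd):
    if len(wd) > 4:
        # explicit return value
        return len(wd)
    # no return statement is reached here, so Python returns None

long_result = length_if_long('colorless')
short_result = length_if_long('hat')

print('Long word: ', long_result)   # 9
print('Short word:', short_result)  # None
```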


- Functions can return more than one value
- The values are separated by commas after *return*; Python packs them into a tuple

In [None]:
def myfunc():
    #function definition
    #collect two strings
    x = input('First string: ')
    y = input('Second string: ')
    z = x + ' ' + y #concatenate strings
    #return all three
    return len(x),len(y),z

#invoke function saving all
a,b,c = myfunc()
#3 return values
print('a =',a)
#print the 3 values
print('b =',b)
print('c =',c)

- function prompts for two strings 
- function returns three values
  - length of the first string
  - length of the second string
  - concatenation of the two strings
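
A short sketch of what "returning three values" means in practice: the values travel together as one tuple, which can be kept whole or unpacked. The function name *lengths_and_join* and the fixed strings are assumptions made so the example runs without user input.

```python
def lengths_and_join(x, y):
    # the commas build a single tuple of three items
    return len(x), len(y), x + ' ' + y

result = lengths_and_join('green', 'ideas')
print(type(result))  # <class 'tuple'>
print(result)        # (5, 5, 'green ideas')

# unpacking the tuple into three variables
a, b, c = result
print(a, b, c)
```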

# 5.3 Functions That Take Arguments

- superficially simple to do, but can get tricky
- function arguments:
  - variable names to be used by the function
  - listed in the parentheses in the function definition

In [None]:
#function that takes 2 arguments
def myfunc(a,b):
    #return the concatenation
    #OR addition of those values
    return a + b

#invoke the function with numbers
print(myfunc(3,10))
#invoke the function with strings
print(myfunc('strings ','too'))

- two arguments, a and b.
- function applies the **addition operator +**
  - integer arguments -> addition of the arguments
  - string arguments -> concatenation of those arguments
- Python does not restrict the types of the arguments!
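
Because Python does not check argument types in advance, a mismatch only shows up when the function runs. A minimal sketch, reusing *myfunc()* from above; the variable *message* is added only for this illustration.

```python
def myfunc(a, b):
    return a + b

try:
    # an integer and a string cannot be combined with +
    myfunc(3, 'too')
    message = 'no error'
except TypeError as err:
    message = 'TypeError'
    print('Cannot add an integer and a string:', err)
```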

### Functions and mutability

In [None]:
x = 'a value'
def anotherfunc(a):
    a = 'another value!'
    return a
print(x)
print(anotherfunc(x))
print(x)

- strings are **immutable**
- inside the function a new value is assigned to **a**
- old value still attached to **x** -> not collected as *garbage*
- after the function call, the original value of **x** is printed

In [None]:
x = [4,5,6]
def anotherfunc(a):
    a.append(7)
    return a

print(x)
print(anotherfunc(x))
print(x)

- Value of **x** changes after the function is called!
- Reason - lists are **mutable**.
- Inside the function, **a** refers to the same list as **x**, so *append()* changes that one list.
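
If the caller's list should stay untouched, one possible approach is to work on a copy inside the function. This is a sketch, not the only way to do it; *appended_copy* is a hypothetical name.

```python
def appended_copy(a):
    # list(a) builds a new list with the same items
    b = list(a)
    b.append(7)
    return b

x = [4, 5, 6]
y = appended_copy(x)

print(x)  # [4, 5, 6]
print(y)  # [4, 5, 6, 7]
```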

### Ways of invoking functions

In [None]:
#function definition
def thefunction(x,y):
    return x + ' ' + y

#invoke the function 3 ways
print(thefunction('one','way'))

In [None]:
print(thefunction(x='another',y='way'))
print(thefunction(y='way',x='yet another'))

- two-argument function
- invoked in three different ways
  - providing two positional arguments (order matters)
  - naming the arguments, in the order of the definition
  - naming the arguments, in a different order

In [None]:
#function with default for 2nd arg
def f(x,y='oops'):
    return x + ' ' + y

#invoked 3 ways

print(f('hat'))
print(f(x='chair'))
print(f('hat','chair'))

In [None]:
#function with unspecified
#number of unnamed and named arguments
def func(*args,**kwargs):
    for a in args:
        #print unnamed args
        print(a)
    for k in kwargs: #print named args
        print(k,'\t',kwargs[k])

#invoked with unnamed FOLLOWED by named arguments
func(3,6,8,hat='wow',chair=3.5)
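
A small sketch of what the two special parameters hold inside the function: the unnamed arguments arrive as a tuple and the named ones as a dictionary. The function name *describe* is made up for this illustration.

```python
def describe(*args, **kwargs):
    # report the container types, the number of unnamed
    # arguments and the (sorted) names of the named ones
    return type(args).__name__, type(kwargs).__name__, len(args), sorted(kwargs)

info = describe(3, 6, 8, hat='wow', chair=3.5)
print(info)  # ('tuple', 'dict', 3, ['chair', 'hat'])
```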

# 5.4 Recursive and Lambda Functions

In [None]:
#function with 2 args:
#a function f
# and something else x
def func(f,x):
    return f(x)

print(func(len,'hat'))

In [None]:
def func(x):
    if x == 'L':
        return len
    else:
        return type
#invoking the function returns a
#function which we apply to
#'chair'. This may look confusing....
print(func('L')('chair'))
print(func('A')('chair'))

- Recursive functions call themselves
- e.g. function for calculating factorials

In [None]:
def fac(n):
    #function definition
    if n == 1:
        #base case of recursion
        return 1
    else:
        #recursive clause
        #invokes the function ITSELF
        print(n)
        return (n * fac(n-1))

#invoked with base case
print('1! =',fac(1))

#invoked with recursive case
print('5! =',fac(5))

- recursive function *fac()*
- different return depending on the argument:
  - if the argument is 1, then 1 is returned
  - otherwise the argument times the result of applying *fac()* to the next lower integer
- Walk through example:<br>
Step 1: fac(3)<br>
Step 2: 3 * fac(2)<br>
Step 3: 3 * 2 * fac(1)<br>
Step 4: 3 * 2 * 1 = 6<br>
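
The walk-through can be checked with a slightly more defensive variant of *fac()*. The change of the base case to *n <= 1* is an assumption added here so that *fac(0)* also stops; the sketch still expects a non-negative integer. The standard library's *math.factorial()* serves as a cross-check.

```python
import math

def fac(n):
    # base case: stops the recursion for 0 and 1
    if n <= 1:
        return 1
    # recursive case: n times the factorial of n-1
    return n * fac(n - 1)

print('3! =', fac(3))  # 6
print('5! =', fac(5))  # 120
print(fac(5) == math.factorial(5))  # True
```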

### Lambda functions

- **Lambda** functions are anonymous functions
- Keyword **def** and the **function name** are replaced with the keyword **lambda**
- Can be invoked directly by wrapping the lambda in parentheses and putting the arguments in parentheses after it.

In [None]:
print((lambda x,y,z: x + y + z)('hat', 'test', 'test2'))

In [None]:
def makeAddN(n):
    #function definition
    #returns new function
    return lambda x: x + n

#invoke twice, making 2 new functions
add2 = makeAddN(2)
add6 = makeAddN(6)
#apply those two new functions
print(add2(17))
print(add6(17))
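
Lambda functions are often passed straight to another function. A small sketch, assuming we want to sort words by length using the built-in *sorted()* and its *key* argument; the word list is made up.

```python
words = ['colorless', 'green', 'ideas', 'sleep', 'furiously']

# the lambda tells sorted() to compare words by their length
by_length = sorted(words, key=lambda w: len(w))
print(by_length)
# ['green', 'ideas', 'sleep', 'colorless', 'furiously']
```

Words of equal length keep their original order, because *sorted()* is stable.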

# 5.5 Modules

- Python has various objects, functions, and methods
- Some require an appropriate import statement
  - *e.g. import sys* to get command line arguments via the *sys.argv* variable
  - *e.g. randint()* function requires importing the **random** module
- The most frequently used are always available
- Those needed only in specific situations require importing a specialized module
- The alternative would be a disaster - thousands of functions being available at once!
- How would we tell them all apart? We would need rather long names :)
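
A minimal sketch of the *randint()* case mentioned above. The call to *random.seed()* is an addition for this illustration only, so that repeated runs give the same number; the range 1 to 6 is an arbitrary choice.

```python
import random

# fix the seed so the example is repeatable
random.seed(0)

# randint(a, b) returns a whole number between a and b, both included
roll = random.randint(1, 6)
print('You rolled:', roll)
```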

## This section covers
- finding out what modules are available to you
- getting help on any of them.
- importing modules. 
- writing your own modules.


In [None]:
# Find out what modules are installed on your system
help('modules')
#This will generate a list of every module installed  
# + Python programs in the current directory

In [None]:
#To find out more about a module, import it and then use help()
import re
help(re)

In [None]:
#this doesn't work in a fresh session!
#only the name argv was imported, not the module sys
from sys import argv
print(sys.argv)

In [None]:
#importing whole module
import sys
print(sys.argv)

In [None]:
# importing only specific elements
from sys import argv
print(argv)

In [None]:
# You can import everything from a module:
# e.g. from sys import * 
# Bad idea! It makes everything in the module available 
# Can lead to unintended name conflicts.

In [None]:
#To create an alias for a module, use "as" in the import statement.
import sys as s
print(s.argv)

#widely used aliases (pandas and numpy are third-party packages and must be installed)
import pandas as pd
import numpy as np

In [None]:
# Aliases allow using a different module prefix for any function or object imported
# Different import options help keep the program namespace as uncluttered as possible
# It's important to restrict which elements are available and to control naming to avoid conflicts

# 5.6 Writing your own modules

In [None]:
#our own module (saved as file func25.py)

myVar = 'hats and lemons'  #variable

def myFunc(s):             #function
    return len(s)


In [None]:
#import the module
import func25
#invoke variable with full name
print(func25.myVar)
#invoke both with full names
print(func25.myFunc(func25.myVar))

In [None]:
#import everything from the module
from func25 import *
#invoke variable without prefix
print(myVar)
#invoke both without prefixes
print(myFunc(myVar))

In [None]:
#Finally, we can, as above, import the module with a different name:
#import with abbreviated prefix
import func25 as f
#invoke variable with f prefix
print(f.myVar)
#invoke both with f prefix
print(f.myFunc(f.myVar))

### Imported only and runnable modules

- We can write modules that can be **imported** or be **run on their own**
    - Imported - provide functions and variables that other programs can use
    - Run on their own - use those functions and variables themselves.

- The **__name__** variable can be used for controlling module behaviour
- **__name__** is automatically set to '__main__' when the program is run directly
- When the module is imported, **__name__** is set to the module's name instead


In [None]:
#module that can run on its own
myVar = 'hats and lemons'#variable
def myFunc(s):
    return len(s)#function
#if this is loaded on its own...
if __name__ == '__main__':
    print(myFunc(myVar))

# This program will print out the length of myVar when it is invoked on its own.
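
A rough sketch of how the value of *__name__* differs between the two situations. The standard *random* module is used only as a convenient imported module, and *where_am_i* is a hypothetical helper that mimics the test a module performs on its own *__name__*.

```python
import random

# inside an imported module, __name__ is the module's own name
print(random.__name__)

def where_am_i(name):
    # same comparison as in: if __name__ == '__main__':
    if name == '__main__':
        return 'run on its own'
    return 'imported as ' + name

print(where_am_i(random.__name__))
print(where_am_i('__main__'))
```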

## Introducing special comments - docstrings

- So far comments in programs were used as:
  - notes to ourselves
  - notes to other programmers who might inherit our code
- For **modules** we need to describe the functions and variables our module provides.
- **docstrings** - best way to document our functions for other programmers 
- **docstrings** enable others to use **help()** function to find out more about any function


- **docstring** is a triple-quoted string within a function right after the def line. 
- The triple-quoted string includes two bits of information: 
  - an intuitive statement of what the function does 
  - an explanation of what the argument is

In [None]:
# example func30.py
def myLen(s):
    '''This computes the length of a string.
    s -- the string
    '''
    return len(s)


In [None]:
import func30
help(func30.myLen)
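
The text that *help()* displays is stored on the function itself, in its *__doc__* attribute. A minimal sketch that defines the function in place instead of importing *func30.py*, so that it runs without that file; the name *my_len* is made up.

```python
def my_len(s):
    '''This computes the length of a string.
    s -- the string
    '''
    return len(s)

# the docstring is an ordinary string attached to the function
print(my_len.__doc__)
print(my_len('hats and lemons'))
```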

- Docstrings - used to make the functions of your modules usable by others.
- Comments - used by you or others to understand or alter your code.
- You should make use of both in modules, but in different ways

## Protecting module functions and variables

- Modules can make functions and variables available to other programs.
- Docstrings help to describe their purpose and usage
- Some module functions or variables should be protected (not available to other programs)

- Example purpose of the module:
  - reads in a file
  - returns a list of all words with an even number of letters
  - returns how frequent each of those words is.

- The module may provide a function like *evenCount()*, which:
  - takes a filename as an argument 
  - returns a Python dictionary
    - key is a word with an even number of letters 
    - value is the frequency of that word.

- To work, *evenCount()* may require “helper” functions
- These would be part of the module and be called by *evenCount()*
- Other programmers **neither need nor should have direct access** to them
- **Take-away - private functions should always have names that begin with _.**

In [None]:
# func32.py 
# example of a module with a private function
def myF(s):
    #this uses _mySplit()
    '''This calculates the number of
    words in a string minus one.
    s -- the string
    '''
    wds = _mySplit(s)
    return len(wds)

def _mySplit(s):
    #this is private!
    '''This returns all the words in
    a string except the first.
    This docstring is pointless!
    '''
    ws = s.split()
    return ws[1:]

In [None]:
from func32 import *
#doesn't work!
print(_mySplit("This doesn't work"))

In [None]:
import func32
print(func32._mySplit("Oh, this does work"))

In [None]:
from func32 import _mySplit
print(_mySplit("Oh, this works too"))

- In effect, the amount of privacy granted by the underscore prefix is limited
- It's possible to reach into a module and make use of something with an underscore prefix.
- Yet, Python programmers assume that an underscore at the start of a name indicates private use
- Using underscores helps programmers to understand the logic and intent of your code.

# 5.7 Docstrings and Modules

- Modules are documented with docstrings as well
  - triple-quoted strings that appear at the beginning of a module
  - just as a function docstring appears on the first line of a function
  

In [None]:
'''This is a test module.
Author:
Jan Kowalski
Version: 10 (11/18)
'''
def f(x):
    '''This function doubles its argument.
    Args:
    x: a number to double
    Returns:
    x*2
    '''
    y = x * 2
    return y

- The first few lines of the file document the module.
  - Description of what the module does
  - Good programming practice - author’s name
  - Indication of what version of the software it is (perhaps including the date)
- The first line of a function can also be a docstring.
  - description of the function
  - explanation of what arguments the function takes and what values are returned

- Docstrings are instances of public documentation.
- They can be displayed via the help() function once the module is imported.
- Private functions (with a prefixed underscore) are left out of the help() listing for the module, even if they have docstrings.