# More about Functions


## Functions with variable arguments

* A standard Python function takes exactly as many arguments as it has parameters

* If a function takes 3 parameters
    * you can pass 3 arguments

* Any attempt to pass more values will result in an error.

#### Sometimes we don't know how many values we have to pass

* Example #1 How many arguments does print() take?
    * print(1)
    * print(1,2,3,4,5)

* Example #2 We want to sum a given sequence of numbers
    * sum(1,2)
    * sum(1,2,3)
    * sum(1,2,3,4,9,4,9,1)


* How do we define a function that can take a different number of arguments depending on the requirement?


#### Approach #1 The function may take a sequence

In [1]:
def sum( values ):
    total = 0
    for value in values:
        total += value
    return total

##### Now we can pass values by wrapping them in some sequence

In [2]:
sum([1,2,3,4])

10

In [3]:
sum((1,2,3,4,5,6,7,8,9,10))

55

#### Problem solved. Right?
* It works.
* It looks ugly
    * every call needs two wrappers: the call parentheses plus a list or tuple around the values
* print() doesn't require that


### Approach #2 Python variable argument syntax

* Python officially lets you pass a variable number of arguments to a function
* they are collected internally and stored in a tuple
* to do this, we prefix the parameter name with a **\***



In [5]:
def call_me(*args):
    print(type(args),args)

In [7]:
call_me(1,2,3,4,5)
call_me(1,2,3)
call_me(1)
call_me()

<class 'tuple'> (1, 2, 3, 4, 5)
<class 'tuple'> (1, 2, 3)
<class 'tuple'> (1,)
<class 'tuple'> ()


#### sum function revisited
* note
    * nothing changes other than parameter with **\***



In [8]:
def sum( *values ):
    total = 0
    for value in values:
        total += value
    return total

In [9]:
print(sum(1,2,3,4))
print(sum(1,2,3,4,5))
print(sum())

10
15
0


#### Assignment 2.2  Reuse the given sum() function to create an average() function that averages any number of items

* your average should internally call sum()

In [10]:
def average(*values):
    return sum(values)/len(values)

In [11]:
average(1,2,3,4)

TypeError: unsupported operand type(s) for +=: 'int' and 'tuple'
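The error comes from how a `*values` parameter collects arguments: calling `sum(values)` passes the whole tuple as a single argument, so the loop tries to add a tuple to an integer. A minimal sketch of the difference, using a hypothetical `add_all()` function (the name is only for illustration) so that it does not clash with the `sum()` defined in these notes:

```python
def add_all(*values):
    # values is always a tuple holding the positional arguments
    total = 0
    for value in values:
        total += value
    return total

numbers = (1, 2, 3, 4)

# add_all(numbers) passes ONE argument, so inside the function
# values == ((1, 2, 3, 4),) and the loop tries 0 + (1, 2, 3, 4): TypeError.

# add_all(*numbers) unpacks the tuple into four separate arguments,
# so inside the function values == (1, 2, 3, 4).
print(add_all(*numbers))  # 10
```

Prefixing the argument with `*` at the call site is the mirror image of prefixing the parameter with `*` in the definition: one unpacks, the other collects.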

In [12]:
def sum( *values ):
    total = 0
    for value in values:
        total += value
    return total


def average(*values):

    return sum(*values)/len(values)



In [13]:
average(1,2,3,4)

2.5

#### Assignment 2.3

* write a function **frequency** that calculates the frequency distribution of a given set of values

* the values may be passed using a sequence or using variable args.

```python

f1 = frequency(1,1,2,2,1,1,1,1,9,2,9,6,4,1,1,9) # {1:8, 2:3, 9:3, 4:1, 6:1}

f2= frequency([1,2,3,3,2,3,2,4,5]) #{ 1:1, 2:3, 3:3, 4:1, 5:1}

```


* now create a function called **plot_histogram** that can plot a histogram for a given frequency data


```python
plot_histogram({2:4, 3:7, 1:1, 4:2, 5:1, 6:3})
```

<pre>
 2 | ===== ===== ===== ===== 4
 3 | ===== ===== ===== ===== ===== ===== ===== 7
 1 | ===== 1
 4 | ===== ===== 2
 5 | ===== 1
 6 | ===== ===== ===== 3 
</pre>


* Allow options to customize
    * bar design
    * to show the frequency values or not (by default True)
    * align frequency values (by default False)


* example: Aligned frequency with bar made up of Pipes

<pre>
 2 | ||||| ||||| ||||| |||||                    4
 3 | ||||| ||||| ||||| ||||| ||||| ||||| |||||  7
 1 | |||||                                      1
 4 | ||||| |||||                                2
 5 | |||||                                      1
 6 | ||||| ||||| |||||                          3 
</pre>
 

* example2: frequency with bar made up of Pipes and label hidden

<pre>
 2 | ||||| ||||| ||||| |||||                    
 3 | ||||| ||||| ||||| ||||| ||||| ||||| |||||  
 1 | |||||                                      
 4 | ||||| |||||                                
 5 | |||||                                      
 6 | ||||| ||||| |||||                           
</pre>

##### frequency v1

In [40]:
def frequency(values):
    
    print(values)
    data={}

    for value in values:
        data[value]=values.count(value)

    return data

In [41]:
data = [2,2,1,9,1,1,1,1,1,4,5,4,7,1,1,1,2,9,9,1,1,1,2]

In [42]:
f1= frequency(1,2,1,1,4,5,7)

print(f1)

f2= frequency(data)

print(f2)

TypeError: frequency() takes 1 positional argument but 7 were given

### How to make a function accept either variable arguments or a sequence


### frequency v2

In [46]:
def frequency(*values):
    
    sequences = ("list","tuple")
    if len(values)==1 and type(values[0]).__name__ in sequences:
        values = values[0]      


    data={}
    for value in values:
        data[value]=values.count(value)

    return data

In [47]:
f1= frequency(1,2,1,1,4,5,7) #passing variable argument

print(f1)

f2= frequency(data) # passing list

print(f2)

d3=(2,3,9,9,2,4,7,2,1,1,1)

f3= frequency(d3) #passing tuple
print(f3)

{1: 3, 2: 1, 4: 1, 5: 1, 7: 1}
{2: 4, 1: 12, 9: 3, 4: 2, 5: 1, 7: 1}
{2: 3, 3: 1, 9: 2, 4: 1, 7: 1, 1: 3}
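The check above compares type names as strings. A minimal alternative sketch uses the built-in `isinstance()`, which also accepts subclasses of `list` and `tuple`. The names `as_values()` and `show()` are hypothetical helpers introduced only for this illustration:

```python
def as_values(args):
    # args is the tuple collected by a *args parameter
    if len(args) == 1 and isinstance(args[0], (list, tuple)):
        return args[0]   # caller passed one sequence: use its items
    return args          # caller passed separate values

def show(*args):
    return as_values(args)

print(show(1, 2, 3))     # (1, 2, 3)
print(show([1, 2, 3]))   # [1, 2, 3]
print(show((1, 2), 3))   # ((1, 2), 3)
```

Note that a sequence is only unwrapped when it is the sole argument; in the last call the tuple `(1, 2)` stays an ordinary value.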


### Problem with the current frequency algorithm


* It has a complexity of O(n^2)
    * It loops through all n items
        * this is essential
    * For each item it calls count()
        * count() loops through the whole sequence again (n comparisons)
    * For an item that occurs x times
        * the result can be calculated using a single count() call
            * but the same result is recalculated x-1 more times, costing (x-1)*n comparisons
                * waste.



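To see the repeated work, we replace the built-in count() method with our own count() function that prints every call: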

In [52]:
def count(values, test):
    _count=0
    
    for value in values:
        if value==test:
            _count+=1
    print(f' {test} => {_count}')
    return _count

def frequency(*values):
    
    sequences = ("list","tuple")
    if len(values)==1 and type(values[0]).__name__ in sequences:
        values = values[0]      


    data={}
    for value in values:
        data[value]=count(values,value)

    return data

In [54]:
d3=(2,1,2,2,2,2,1,1,1,2)
frequency(d3)

 2 => 6
 1 => 4
 2 => 6
 2 => 6
 2 => 6
 2 => 6
 1 => 4
 1 => 4
 1 => 4
 2 => 6


{2: 6, 1: 4}

#### Frequency v2, improved (not the best logic)

* avoid recounting if item is already counted.

In [55]:
def frequency(*values):
    
    sequences = ("list","tuple")
    if len(values)==1 and type(values[0]).__name__ in sequences:
        values = values[0]      


    data={}
    for value in values:
        if value not in data.keys(): # if this value is not already counted
            data[value]=count(values,value)

    return data

In [56]:
frequency(d3)

 2 => 6
 1 => 4


{2: 6, 1: 4}

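### What is the complexity of this code

* O(n*(k+1)), which simplifies to O(n*k)
    * where n is the total number of items and k is the number of distinct values (keys)

    * the +1 comes from the loop inside frequency(), which visits each of the n items once

    * it runs count() only once per key
        * each count() call scans all n items, so the k calls cost n*k comparisons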

```python
data={}
for value in values:
    if value not in data.keys(): # if this value is not already counted
        data[value]=count(values,value)

return data
```

* here we have accounted for 
    * for loop inside frequency
    * count()

* what about the "in" check?
    * is the "in" check also a loop?

### Does the "in" operator run a loop to check if a value is present in a list, set, or dict?

* In the case of a list/tuple, "in" runs a loop over the items.
* set/dict use a hashing mechanism, which is extremely fast and doesn't require a loop
    * lookups are said to have O(1) complexity on average

* But even if it were not that fast, there would typically be only a few keys, so the loop time would be negligible
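The difference between the two kinds of lookup can be observed with the standard `timeit` module. This is only a rough sketch: the container size, the target value and the repeat count are arbitrary choices, and the measured times will vary from machine to machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Time 200 membership tests against each container
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f'list: {list_time:.6f}s  set: {set_time:.6f}s')
```

On a typical run the list lookup should come out far slower, because every test walks the list from the start, while the set jumps straight to the matching hash slot.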




### Frequency v3

* we still have room for improvement



In [65]:
def normalize_arguments(values):
    sequences = ("list","tuple")
    if len(values)==1 and type(values[0]).__name__ in sequences:
        values = values[0]   

    return values

def frequency(*values):
    
    values=normalize_arguments(values)   

    data={}
    for value in values:        
        data[value]=data.get(value,0)+1

    return data

In [67]:
print(data)
frequency(data)

[2, 2, 1, 9, 1, 1, 1, 1, 1, 4, 5, 4, 7, 1, 1, 1, 2, 9, 9, 1, 1, 1, 2]


{2: 4, 1: 12, 9: 3, 4: 2, 5: 1, 7: 1}

### Performance
* This code loops through the entire sequence exactly once, making the complexity O(n)
* get(value, 0) also needs to search for the key
    * but a dict key search is extremely fast, with O(1) complexity
    * if it were a list, the search would add a complexity of O(k), making the algorithm O(n*k)


* The above code can also be written, with the same performance, in an expanded form as below

In [68]:
def frequency(*values):
    
    values=normalize_arguments(values)   

    data={}
    for value in values:        
        if value in data:
            data[value]+=1
        else:
            data[value]=1

    return data
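
Both versions above count every item in a single pass. For comparison, the standard library's `collections.Counter` does the same kind of single-pass counting. The sketch below wraps it in a hypothetical `frequency_with_counter()` function (the name and the `isinstance()` check are assumptions for this illustration, not part of the notes above):

```python
from collections import Counter

def frequency_with_counter(*values):
    # Same idea as normalize_arguments(): unwrap a single list/tuple argument
    if len(values) == 1 and isinstance(values[0], (list, tuple)):
        values = values[0]
    # Counter counts each item in one pass; dict() turns it into a plain dict
    return dict(Counter(values))

print(frequency_with_counter(2, 1, 2, 2, 1))   # {2: 3, 1: 2}
print(frequency_with_counter([1, 1, 9]))       # {1: 2, 9: 1}
```

Writing the loop by hand, as done above, is still the better exercise; `Counter` is simply what you would likely reach for in everyday code.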

### plot_histogram

```python
plot_histogram({2:4, 3:7, 1:1, 4:2, 5:1, 6:3})
```

<pre>
 2 | ===== ===== ===== ===== 4
 3 | ===== ===== ===== ===== ===== ===== ===== 7
 1 | ===== 1
 4 | ===== ===== 2
 5 | ===== 1
 6 | ===== ===== ===== 3 

</pre>

In [69]:
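def plot_histogram(frequency):
    design='=== '   # one bar segment per occurrence
    # the loop variable is named count so it does not shadow the frequency parameter
    for value,count in frequency.items():
        print(f'{value}|{design*count} {count}')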

In [71]:
plot_histogram({2:4, 3:7, 1:1, 4:2, 5:1, 6:3})

2|=== === === ===  4
3|=== === === === === === ===  7
1|===  1
4|=== ===  2
5|===  1
6|=== === ===  3


TODO: add the customization as per assignment 2.3
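
One possible way to approach the customization options from Assignment 2.3 is sketched below. It is not the notes' own solution: the parameter names `design`, `show_values` and `align_values`, and the helper `format_histogram()`, are assumptions made for this illustration.

```python
def format_histogram(frequency, design='=====', show_values=True, align_values=False):
    """Build the histogram rows as strings (hypothetical helper)."""
    # Width of the longest bar; used only to line up the counts
    longest = max(frequency.values(), default=0)
    bar_width = longest * (len(design) + 1) - 1

    lines = []
    for value, count in frequency.items():
        bar = ' '.join([design] * count)   # e.g. '===== =====' for count 2
        if show_values and align_values:
            bar = bar.ljust(bar_width)     # pad so every count starts in the same column
        line = f'{value:>2} | {bar}'
        if show_values:
            line += f' {count}'
        lines.append(line)
    return lines


def plot_histogram(frequency, design='=====', show_values=True, align_values=False):
    # Print the rows built by format_histogram()
    for line in format_histogram(frequency, design, show_values, align_values):
        print(line)


plot_histogram({2: 4, 3: 7, 1: 1}, design='|||||', align_values=True)
```

Separating the string building from the printing keeps the formatting logic easy to check, since the rows can be inspected as a plain list before anything is printed.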