# Fundamentals of Data Analysis (FoDA) - Tasks 2020

This is the workbook for the tasks that I have created for the FoDA module. It was created by Sheldon D'Souza (G00387857@gmit.ie).

***

### Task for the week of October 5, 2020 - Writing a function 'counts'

The objective of this task is to write a Python function called 'counts' that takes a list as input and returns a dictionary with the unique items in the list as keys and the number of times each item appears as values.

#### How the function works

- The function requires a list to be passed as an argument
- It will then ask the user for input on whether the user wants to count upper and lower case items as unique. A message will appear on the screen:
        
        Do you want the count to be case sensitive
            Enter '1' to treat all as upper case
            Enter '2' to treat all as lower case
            Enter '3' to count in original case

- Depending on the user's choice, the function treats string items as case sensitive or not. For example, a list containing 'A' and 'a' gives a count of A:2 when option 1 is selected, a:2 when option 2 is selected, and A:1, a:1 when option 3 is selected. If the user enters anything other than 1, 2 or 3, the function returns an error message and terminates.
- The function returns a dictionary of unique items, OR an error message if the list contains items that fall outside the parameters described in the function limitations below.
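
The effect of the three case options can be illustrated with a small, non-interactive sketch. The helper name `count_with_case` and its `mode` parameter are hypothetical; they stand in for the interactive prompt and are not the function's actual code.

```python
def count_with_case(items, mode):
    """Count items, folding the case of strings according to mode.

    mode is a hypothetical stand-in for the user's choice:
    'upper' (option 1), 'lower' (option 2) or '' (option 3).
    """
    counts = {}
    for item in items:
        if isinstance(item, str):
            if mode == 'upper':
                item = item.upper()
            elif mode == 'lower':
                item = item.lower()
        # Create the key with a count of 0 if needed, then add 1
        counts[item] = counts.get(item, 0) + 1
    return counts


letters = ['A', 'a']
print(count_with_case(letters, 'upper'))  # {'A': 2}
print(count_with_case(letters, 'lower'))  # {'a': 2}
print(count_with_case(letters, ''))       # {'A': 1, 'a': 1}
```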


#### Function limitations

The function accepts a list subject to the following limitations:

- The list can contain any characters (alphanumeric, special characters etc.)
- The list can contain other lists or tuples. However, an embedded list, tuple or dictionary cannot contain any further lists, tuples or dictionaries. In other words, ONLY one layer of embedded lists, tuples or dictionaries is allowed
- The list can contain a dictionary. However, that dictionary cannot contain any further list, tuple or dictionary (see previous point). Furthermore, the embedded dictionary needs to contain only numeric values.
- The items in embedded lists, tuples and dictionaries are added to the counts for the main list

If the above limitations are exceeded, the function will return an error and terminate.
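
As a rough illustration of these limits, the sketch below checks a list up front. The helper `within_limits` is hypothetical: the function in this workbook does not validate its input this way and instead relies on error handling while counting. The sketch also assumes that "numeric" means `int` or `float`, which is broader than the `str(value).isnumeric()` test used in the code cell.

```python
def within_limits(a_list):
    """Return True if a_list respects the one-layer nesting limits (sketch)."""
    containers = (list, tuple, dict)
    for item in a_list:
        if isinstance(item, (list, tuple)):
            # An embedded list or tuple may not hold further containers
            if any(isinstance(inner, containers) for inner in item):
                return False
        elif isinstance(item, dict):
            # Assumption: numeric means int or float (bool is excluded)
            for value in item.values():
                if isinstance(value, bool) or not isinstance(value, (int, float)):
                    return False
    return True


print(within_limits([[1, 2], 3, {1: 7}, 'a', (1, 2)]))  # True
print(within_limits([[1, [2]]]))                        # False
print(within_limits([{1: 'x'}]))                        # False
```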


#### Researching and planning the exercise

I had seen a similar exercise covered in 'Automate the Boring Stuff with Python' by Al Sweigart [1] which I used as a base for the exercise. 

I planned the assignment as follows:

- Researching and writing the core code that does the count
- Adding the code for embedded lists, tuples and dictionaries
- Adding code to check whether the values of embedded dictionaries are numeric
- Adding code to merge the embedded dictionary and the 'output' dictionary
- Adding the code for the user input, which sets a flag for whether characters should be counted as case sensitive, and adding code for each 'flag' scenario
- Adding try and except code to catch an exception and terminate the function if the input exceeds the built-in limitations



#### Writing the code

*Core code*

The core piece of code is not very complicated and mainly revolves around a for loop that iterates over the list [4].

The core code uses the 'setdefault' method of a dictionary object to create keys where they do not already exist, and uses a counter to keep track of the number of instances of a particular key. A sample of this code is below:

```python
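count_dict.setdefault(item, 0)  # create the key with a count of 0 if it does not exist yet
count_dict[item] = count_dict[item] + 1  # add 1 to the count for this key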
```
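
Placed inside a for loop over a flat list, the two lines above behave as in the minimal sketch below. The sample list is made up for illustration. The comparison with `collections.Counter` is included only as a cross-check; it is a standard-library alternative and not what the function uses.

```python
from collections import Counter

sample = ['a', 'b', 'a', 'c', 'a']  # illustrative data only

count_dict = {}
for item in sample:
    count_dict.setdefault(item, 0)           # new keys start at 0
    count_dict[item] = count_dict[item] + 1  # each occurrence adds 1

print(count_dict)  # {'a': 3, 'b': 1, 'c': 1}

# Cross-check against the standard-library Counter
print(count_dict == dict(Counter(sample)))  # True
```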

*Input code*

The input code uses the input statement to display a prompt asking the user to choose one of the numbers 1, 2 or 3. The function then uses an if, elif and else statement to evaluate the user input and sets a variable to a certain state depending on that input. The else statement terminates the function and returns an error message if the user enters an invalid choice. (See the documentation above for details of the choices offered.)
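
The if, elif and else logic can be sketched separately from the prompt so that it can be tried without typing anything. The helper name `parse_case_choice` is hypothetical, and returning `None` for an invalid choice is an assumption made for this sketch; the function itself returns an error message instead.

```python
def parse_case_choice(choice):
    """Map the raw answer from the prompt to a case flag (sketch)."""
    if str(choice) == '1':
        return 'upper'   # treat all strings as upper case
    elif str(choice) == '2':
        return 'lower'   # treat all strings as lower case
    elif str(choice) == '3':
        return ''        # keep the original case
    else:
        return None      # assumption: None signals an invalid choice


# In interactive use the answer would come from input(), for example:
# flag = parse_case_choice(input("Enter 1, 2 or 3: "))
print(parse_case_choice('1'))  # upper
print(parse_case_choice('9'))  # None
```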


*The main code block*

The main code of the program is written as follows:
1. Two blank dictionaries are created. The first holds the output of the function and the second holds the content of any embedded dictionary passed within the list (the two dictionaries are merged in a later part of the code)
2. A for loop iterates through the list and checks the data type of each element
3. The function uses the built-in type function in Python to check the data type of each element of the list passed as an argument
4. A series of if, elif and else statements then runs separate code depending on the data type
5. If the data type is a list or tuple (i.e. one embedded within the main list passed), the function runs another for loop to iterate over each element within that list or tuple and runs the 'core code' as detailed in the 'core code' section above
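
The type check and the inner loop described in the steps above can be sketched as follows. This is a cut-down illustration that leaves out the dictionary and case handling; the name `count_one_layer` is hypothetical.

```python
def count_one_layer(a_list):
    """Count items, unpacking one layer of embedded lists or tuples (sketch)."""
    counts = {}
    for item in a_list:
        if type(item) in (list, tuple):
            # Embedded list or tuple: count each of its elements
            for inner_item in item:
                counts.setdefault(inner_item, 0)
                counts[inner_item] += 1
        else:
            # Any other item is counted directly
            counts.setdefault(item, 0)
            counts[item] += 1
    return counts


print(count_one_layer([[1, 2, 1], 3, (1, 3)]))  # {1: 3, 2: 1, 3: 2}
```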

I used a for loop to iterate over the list and then check the data type. As mentioned above, I have written the program so that it accepts a list or a single dictionary nested one level deep within the original list. Anything more than that causes an unhashable type error (a TypeError), which is handled by the 'try and except' part of the function, and an error is returned to the user.


I used if statements to check the data type of each item within the list. Depending on the data type, I adapted the counting code to suit it.

String items are counted in upper case, lower case or their original case, depending on the option the user selects at the prompt.

If the data type is a dictionary, it is merged with the output dictionary. If any value within the embedded dictionary is non-numeric, the program terminates and informs the user that the dictionary input is not valid.
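
A minimal sketch of the numeric check and the merge is shown below. The helper names `values_are_numeric` and `merge_counts` are hypothetical. The check mirrors the `str(value).isnumeric()` test from the code cell, which means floats and negative numbers are treated as non-numeric.

```python
def values_are_numeric(embedded):
    """Return True if every value passes the str(value).isnumeric() test."""
    return all(str(value).isnumeric() for value in embedded.values())


def merge_counts(count_dict, embedded):
    """Return a new dictionary, adding together the values of shared keys."""
    merged = dict(count_dict)  # copy so the input is left unchanged
    for key, value in embedded.items():
        merged[key] = merged.get(key, 0) + value
    return merged


counts = {1: 3, 2: 3, 3: 3, 7: 1, 'A': 4}
embedded = {1: 7, 3: 2}
if values_are_numeric(embedded):
    print(merge_counts(counts, embedded))  # {1: 10, 2: 3, 3: 5, 7: 1, 'A': 4}
```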

If the input contains more than one layer of lists or dictionaries, the function exits and returns an error (triggered by an unhashable type error).
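
The reason a second layer fails is that lists and dictionaries cannot be used as dictionary keys. The sketch below shows this with a hypothetical helper, `try_count`, which reports whether a single item could be counted. Note that a tuple containing only hashable items is itself hashable, so it would be counted as a key rather than raise an error.

```python
def try_count(item, counts):
    """Try to count one item; return False if it cannot be a dictionary key."""
    try:
        counts.setdefault(item, 0)
        counts[item] += 1
        return True
    except TypeError:
        # Raised for unhashable items such as lists and dictionaries
        return False


counts = {}
print(try_count([2, 3], counts))   # False: a list is unhashable
print(try_count({1: 2}, counts))   # False: a dictionary is unhashable
print(try_count((2, 3), counts))   # True: a tuple of numbers is hashable
```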





#### References
[1] 'Automate the Boring Stuff with Python' by Al Sweigart - Chapter 5 - Dictionaries and Structuring Data: The Dictionary Data Type

[2] https://thispointer.com/how-to-merge-two-or-more-dictionaries-in-python/

[3] https://www.askpython.com/python/examples/python-user-input

[4] A Whirlwind tour of Python by Jake VanderPlas - Chapter X 'For Loops'

In [1]:
#TODO - clean up variable names

def count(a_list):
    '''This function takes a list as an argument and returns a dictionary of unique items and the 
            number of times they appear in the list'''
    
    try: # a try/except block is used to handle TypeErrors in case an invalid list is input
        
        
        #flag will ask for a user choice on whether the count for alpha characters will be case sensitive or not
        flag = input("Do you want the  count to be case sensitive. \
                     Enter '1' to treat all as upper case. \
                     Enter '2' to treat all as lower case. \
                     Enter '3' to count in original case")
        if str(flag) == '1':
            flag = 'upper'
        elif str(flag) == '2':
            flag = 'lower'
        elif str(flag) == '3':
            flag = ''
        else:
            return 'Invalid input. Please rerun program \
                    and make the correct case sensitivity choice' #Invalid choice terminates the program
        
        
        
        count_dict = {} # start with a blank dictionary
        tempdict = {} # a temporary dictionary for use below
        
        for item in a_list: #iterate through each item on the list and count its elements based on the type of the item
            
            if type(item)==list or type(item)==tuple: #The program will allow one layer of list or tuples within the input list
                for inner_item in item:
                    count_dict.setdefault(inner_item, 0) #setdefault will generate a key if none exists and give it the value of zero
                    count_dict[inner_item] = count_dict[inner_item] + 1 #1 will be added to the 'value' of the key in the previous line
            
            elif type(item)==dict: #if a dictionary is passed within the list then this code will iterate through each 'value' of the inner dictionary to ensure that they are numbers
                for value in item.values():
                    if str(value).isnumeric():
                        tempdict = item #if all the values of the dictionary are numbers then the dictionary is valid and it is stored to tempdict variable for further processing
                    else:
                        return 'Error, One or more values of the input dictionary are not numeric. \
                                Please include a proper dictionary' #if the value of the inner dictionary are not numbers the program terminates
            
            #The next three blocks of code (2 elif and 1 else) checks whether the user wants the alpha characters to be upper
            # or lower case or original case and then counts them accordingly converting them where necessary.
            elif type(item)==str and flag == 'lower':
                item_lower = item.lower()
                count_dict.setdefault(item_lower, 0)
                count_dict[item_lower] = count_dict[item_lower] + 1

            elif type(item)==str and flag == 'upper':
                item_upper = item.upper()
                count_dict.setdefault(item_upper, 0)
                count_dict[item_upper] = count_dict[item_upper] + 1

            else:
                count_dict.setdefault(item, 0)
                count_dict[item] = count_dict[item] + 1

        
        #if there is a dictionary in the input list then the following code will merge the two dictionaries together        

        final_dict = {**count_dict, **tempdict} #code to merge the dictionaries into a new dictionary

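        for key, value in final_dict.items(): #for keys present in both dictionaries, add the two counts together
            if key in count_dict and key in tempdict:
                final_dict[key] = value + count_dict[key] #value comes from tempdict, which took precedence in the merge above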

        return final_dict
    
    except TypeError:
        return "Error. You have included more than one list/tuple within a list/tuple or an invalid dictionary. \
                Please refer to Readme file to see acceptable input parameters"

In [2]:
count([[1,2,1,2,3],3,7,{1:7, 3:2},'a', 'A', 'a', 'A',(1,2,3)])

Do you want the  count to be case sensitive.                      Enter '1' to treat all as upper case.                      Enter '2' to treat all as lower case.                      Enter '3' to count in original case1


{1: 10, 2: 3, 3: 5, 7: 1, 'A': 4}