# CIS 41B: REVIEW

## The following is a review and reminder of good practices for efficient Python code.

---

### 1. What can be improved with this code?

In [None]:
scoresDict = {}
scoresDict = readData()  # readData() returns a filled dictionary

##### ANSWER:

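The first line is redundant: the empty dictionary is immediately replaced by the one `readData()` returns. Only this is needed: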
```python
scoresDict = readData()
```

---
### 2. Given the following code:

In [None]:
# this code reads each line of an input file until there's an
#  empty line, and stores fields 2 and 3 of the line as a
#  key/value pair in a dictionary

list = []
d = dict()
with open(filename) as infile:
   aline = infile.readline()
   while aline != "":  
      aline = aline.rstrip().split()    
      list.append(aline)
      aline = infile.readline()
    
for i in range(len(list)):
   d[lines[i][1]] = lines[i][2]

#### 2a. Is there any error with the code?

As written, yes: the rows are appended to `list` (which also shadows the built-in `list` type), but the final loop reads from `lines`, which is never defined in this code, so it raises a `NameError`.

#### 2b. What is the field separator in the input file?  

`.split()` with no argument splits on any run of whitespace, so the field separator is whitespace (spaces or tabs).
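
A quick check of the default behavior, as a minimal sketch. The sample line is made up for illustration and is not taken from the actual input file:

```python
# str.split() with no argument splits on runs of whitespace
# (spaces, tabs, newlines) and ignores leading/trailing whitespace.
sample = "  101\tAlice   92.5\n"   # made-up line, for illustration only
fields = sample.split()
print(fields)                      # ['101', 'Alice', '92.5']

# An explicit separator behaves differently: every single space is a
# boundary, so an empty string shows up between two adjacent spaces.
print("a  b".split(" "))           # ['a', '', 'b']
```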

#### 2c. What can be improved with this code?

In [None]:
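# Fix the list name (the original appended to `list` but read from `lines`)
# and build the dictionary with a comprehension.
# split() with no argument already discards trailing whitespace, so rstrip() is not needed.
with open(filename) as infile:
    lines = [line.split() for line in infile]
d = {fields[1]: fields[2] for fields in lines}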

---

### 3. This code builds a dictionary with first letters as keys and, for each value, the set of country names that start with that letter.

#### Example of a key/value pair: `'A': {'Afghanistan', 'Albania', 'Algeria'}`
#### The country names are keys in another dictionary called countriesDict.

In [None]:
str1="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
lettersDict = {}
for elem in str1:
   aSet = set()
   for country in countriesDict:    
      if country.startswith(elem) :
         aSet.add(country)
         lettersDict[elem] = aSet

In [1]:
# a) Is there anything wrong with the code?  
# b) What can be improved in the code?

str1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Define countriesDict as a dictionary that maps country names to some values
countriesDict = {"Afghanistan": {"capital": "Kabul", "population": 38928346},
                 "Albania": {"capital": "Tirana", "population": 2877797},
                 "Algeria": {"capital": "Algiers", "population": 43851044},
                 # add more countries...
                }

# Use dictionary comprehension to create a dictionary that maps each letter in str1 to a set of countries that start with that letter
lettersDict = {elem: {country for country in countriesDict if country.startswith(elem)} for elem in str1}

print(lettersDict)

{'A': {'Algeria', 'Afghanistan', 'Albania'}, 'B': set(), 'C': set(), 'D': set(), 'E': set(), 'F': set(), 'G': set(), 'H': set(), 'I': set(), 'J': set(), 'K': set(), 'L': set(), 'M': set(), 'N': set(), 'O': set(), 'P': set(), 'Q': set(), 'R': set(), 'S': set(), 'T': set(), 'U': set(), 'V': set(), 'W': set(), 'X': set(), 'Y': set(), 'Z': set()}


In [None]:
# Here is the in-class corrected code:
str1="ABCDEFGHIJKLMNOPQRSTUVWXYZ"   
lettersDict = {}
for elem in str1:                       # For each letter in str1:
    aSet = set()                        # 1. Create a new set (Recall: unique elements)
    for country in countriesDict:       # 2. For each country in countriesDict:
        if country.startswith(elem) :   # 3. If the country name starts with the letter:
            aSet.add(country)           # 4. Add the country to the set.
    lettersDict[elem] = aSet            # 5. Add the set of countries to the dictionary. 

# A problem: for every letter, ALL countries are checked, so the running time is O(M * N) for M letters and N countries

#-----------------------------------------------------------# 

#### Version 2: O(M + N), one pass over the N countries, then one pass over the M letters
# Use a lookup dictionary to avoid repeat checks, and loop over countriesDict in outer loop
# lookup dictionary maps the first letter of each country to a list of countries
lookupDict = {}
# Loop over countriesDict, getting the first letter 
for country in countriesDict:
    first_letter = country[0]
    # Check if the first letter is already in the lookup dictionary
    if first_letter in lookupDict:
        # If it is, append the country to the list of countries
        lookupDict[first_letter].append(country)
    else:
        # Otherwise, create a new list with the country as the first element
        lookupDict[first_letter] = [country]

# Create lettersDict: {'A': {C1, C2, C3, ...}, 'B': {...}, ...}
# Dictionary comprehension w/ the lookup dict to avoid repeat checks
lettersDict = {letter: set(lookupDict.get(letter, [])) for letter in str1}
print(lettersDict)

#-----------------------------------------------------------# 
%reset -f
str1="ABCDEFGHIJKLMNOPQRSTUVWXYZ"  
countriesDict = {"Afghanistan": {"capital": "Kabul", "population": 38928346},
                 "Albania": {"capital": "Tirana", "population": 2877797},
                 "Algeria": {"capital": "Algiers", "population": 43851044},
                 # add more countries...
                }
#### VERSION 3: If the goal is concise code rather than efficiency
# A nested comprehension does the whole thing in one statement, but it is O(M * N) again
lettersDict = {letter: {country for country in countriesDict if country.startswith(letter)} for letter in str1}
print(lettersDict)
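
The grouping step in Version 2 can also be written with `collections.defaultdict`, which removes the explicit "is the key already there?" check. This is only a sketch of one alternative, not the in-class solution. The sample `countriesDict` below is a made-up subset with placeholder values:

```python
from collections import defaultdict
import string

# Made-up sample data, for illustration only (values are placeholders)
countriesDict = {"Afghanistan": 1, "Albania": 2, "Brazil": 3}

# Group countries by first letter in a single pass over the countries: O(N)
grouped = defaultdict(set)
for country in countriesDict:
    grouped[country[0]].add(country)

# Give every uppercase letter an entry, even if its set is empty: O(M)
lettersDict = {letter: grouped.get(letter, set())
               for letter in string.ascii_uppercase}

print(lettersDict["A"])   # the two sample countries starting with 'A'
print(lettersDict["Z"])   # set()
```

Using `grouped.get(letter, set())` rather than `grouped[letter]` avoids inserting new empty entries into the `defaultdict` while reading from it.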

---

### 4. What's good about defining this global constant for a program that has a default input file?

In [None]:
DEFAULT_FILE = "input.txt"
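
One commonly cited benefit is that the name documents what the string is for, and the value only has to be changed in one place. A minimal sketch of how the constant might be used follows. The helper `get_filename` is hypothetical and not part of the original program:

```python
DEFAULT_FILE = "input.txt"

def get_filename(user_input):
    """Hypothetical helper: fall back to the default when the user enters nothing."""
    return user_input.strip() or DEFAULT_FILE

print(get_filename(""))             # input.txt
print(get_filename("scores.txt"))   # scores.txt
```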

___  


### 5. If the input file can be opened successfully, this code reads lines from the file into a list to be processed later. If the file can't be opened, the code prints an error message. Is there any error with the way the exception is handled?

In [None]:
dataList = []
try:
   with open(filename)as infile :
      for line in infile:
         dataList.append(line.rstrip())       
except IOError:
   print("Error opening " + filename)

# code to work with data in the list is here
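
One likely concern is that, after the error message is printed, execution simply continues into the code that works with the list, which is then empty. The sketch below shows one way to make the failure explicit to the caller. The helper `load_lines` and the file name are hypothetical, and ending the program (for example with `sys.exit`) would be another reasonable choice:

```python
def load_lines(filename):
    """Hypothetical helper: return the stripped lines, or None if the file can't be opened."""
    try:
        with open(filename) as infile:
            return [line.rstrip() for line in infile]
    except OSError:                 # IOError is an alias of OSError in Python 3
        print("Error opening " + filename)
        return None

# Made-up file name, assumed not to exist, to exercise the error path
dataList = load_lines("no_such_file_for_demo.txt")

if dataList is None:
    print("Nothing to process")     # skip the processing step entirely
else:
    print(len(dataList), "lines read")
```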

---

### 6a. An input file is made of lines of floating point values, one number per line. The file can be opened successfully so no need to test for file open success. Write the most efficient code to create a list of floats from the input file.

In [None]:
# Solution:

######
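
One compact option is a list comprehension over the file object; `list(map(float, infile))` would be a close alternative. The sketch below writes a small throwaway file first so that it runs on its own. The values and the temporary file are made up for illustration:

```python
import os
import tempfile

# Create a small throwaway input file so the sketch is self-contained
# (the values are made up for illustration).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1.5\n2.25\n-3.0\n")
    filename = tmp.name

# float() ignores surrounding whitespace, including the trailing newline,
# so no rstrip() is needed.
with open(filename) as infile:
    floatList = [float(line) for line in infile]

os.remove(filename)                 # clean up the throwaway file
print(floatList)                    # [1.5, 2.25, -3.0]
```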

---

### 7. Given that:

In [None]:
L = ["CIS 41A", "CIS 28", "EWRT 10", "PE 10"]

### Write the most efficient code to print one line of output, as a comma-separated sequence of the elements of L:

In [None]:
# Solution:

###
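
A possible answer, sketched with `str.join`, which builds the whole line in one call instead of concatenating in a loop. The variable name `line` is introduced here only for illustration:

```python
L = ["CIS 41A", "CIS 28", "EWRT 10", "PE 10"]

# join() works because every element of L is already a string
line = ", ".join(L)
print(line)                         # CIS 41A, CIS 28, EWRT 10, PE 10
```

`print(*L, sep=", ")` produces the same output without building the intermediate string.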

___  

### 7b. Write the most efficient code to print 4 lines of output in column format:
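
A possible answer, assuming "column format" means one element of `L` per line (the worksheet does not spell this out). The variable name `column` is introduced here only for illustration:

```python
L = ["CIS 41A", "CIS 28", "EWRT 10", "PE 10"]

# Join with newlines so a single print() call emits all four lines
column = "\n".join(L)
print(column)
```

`print(*L, sep="\n")` is an equivalent one-liner.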