# Ex. 01 - Custom cryptography
Write a program that offers both encoding and decoding functionalities over textual data.
The desired encoding algorithm operates by simple substitution, mapping each character to another according to a given cipher.

For instance, consider the following cipher, where each character in the top row is replaced by the one below it. Any character that does not appear in the cipher is left unchanged by the algorithm.
Feel free to play with the cipher, but make sure matches are non-ambiguous: we need to be able to perform both encoding and decoding.

| 1 | 2 | 3 | 4 | 5 | a | e | i | o | u |
|---|---|---|---|---|---|---|---|---|---|
| 3 | 5 | 1 | 2 | 4 | e | i | u | a | o |

## Examples
- `hello 123` becomes `hilla 351`
- `900 minus 600 is 300` becomes `900 munos 600 us 100`

## Hints
- Consider which data structure, if any, could be most suitable to store our cipher.
- Consider which data structure, if any, could be most suitable to store our input and output data.

In [44]:
cypher={
    '1':'3',
    '2':'5',
    '3':'1',
    '4':'2',
    '5':'4',
    'a':'e',
    'e':'i',
    'i':'u',
    'o':'a',
    'u':'o'
}


def encrypt(cypher: dict, *list_sentence: str):
    # Substitute each character found in the cipher; leave the others unchanged.
    return tuple(''.join(cypher.get(char, char) for char in sentence) for sentence in list_sentence)



s1="hello 123"
s2="900 minus 600 is 300"

print(encrypt(cypher,s1,s2))


('hilla 351', '900 munos 600 us 100')
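
The exercise also asks for decoding. Since the cipher must be non-ambiguous, decoding can reuse the same substitution with the mapping reversed. The following is a minimal sketch under that assumption; the names `invert` and `substitute` are illustrative and not part of the original solution.

```python
# Same cipher table as in the exercise statement.
cipher = {
    '1': '3', '2': '5', '3': '1', '4': '2', '5': '4',
    'a': 'e', 'e': 'i', 'i': 'u', 'o': 'a', 'u': 'o',
}


def invert(mapping: dict) -> dict:
    """Return the reverse mapping; fail if two keys share the same target."""
    inverse = {value: key for key, value in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError("ambiguous cipher: two characters map to the same target")
    return inverse


def substitute(mapping: dict, text: str) -> str:
    """Replace every character found in `mapping`; leave the rest unchanged."""
    return ''.join(mapping.get(char, char) for char in text)


encoded = substitute(cipher, "hello 123")
decoded = substitute(invert(cipher), encoded)
print(encoded)  # hilla 351
print(decoded)  # hello 123
```

The length check in `invert` is one way to enforce the non-ambiguity requirement: if two source characters shared a target, the reversed dictionary would silently lose one of them.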


# Ex. 02 - Scrambled words
Write a program that will scramble each word of a text given as argument or read from a file. Scrambling applies only to words of at least length four and it modifies only the letters in the middle of the word: the first and last letter must always remain the same.

> Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.

## Hints
- We can use a regular expression to split text into words
- We can use the `random.shuffle()` function to shuffle a list

In [1]:
import re
import random


s="""Write a program that will scramble each word of a text given as argument or read from a file. Scrambling applies only to words of at least length four and it modifies only the letters in the middle of the word: the first and last letter must always remain the same."""
SIZE=4

def checkLen(s:str):
    return len(s)>=SIZE

def shuffle(st:str):
    mid=list(st)
    random.shuffle(mid)
    return ''.join(mid)

def shuffleString(sentence:str):
    pattern = r'\W+'
    words = re.split(pattern, sentence)
    result=[]

    for word in words:
        # Only words of at least SIZE characters are scrambled and kept;
        # shorter words and punctuation are dropped from the result.
        if checkLen(word):
            f, *middle, l=word
            word=f+shuffle(middle)+l
            result.append(word)

    return ' '.join(result)
    

shuffleString(s)




'Wtire pogarrm taht wlil scrbmale each word txet gvein aermngut read form flie Siacmnblrg apeipls only wdros least lnegth four mdiieofs olny ltretes mldide wrod first lsat lteter msut awlays raimen same'
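
The cell above splits on `\W+` and keeps only the words it scrambles, so short words and punctuation do not appear in its output. A possible alternative, sketched here with the hypothetical names `scramble_word` and `scramble_text`, is to let `re.sub` rewrite each word in place, so that everything between words is copied unchanged.

```python
import random
import re

MIN_LENGTH = 4  # words shorter than this are left untouched


def scramble_word(word: str) -> str:
    """Shuffle the inner letters of `word`, keeping first and last in place."""
    if len(word) < MIN_LENGTH:
        return word
    middle = list(word[1:-1])
    random.shuffle(middle)
    return word[0] + ''.join(middle) + word[-1]


def scramble_text(text: str) -> str:
    """Scramble every run of word characters; everything else is copied as-is."""
    return re.sub(r'\w+', lambda match: scramble_word(match.group()), text)


print(scramble_text("The first and last letter must always remain the same."))
```

Note that `\w` also matches digits and underscores, and that an apostrophe splits a word in two; a stricter pattern may be preferable depending on the input text.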

# Ex. 03.1 - Very strong passwords
Password strength is a very pressing security issue. Let's write a program capable of generating passwords according to a very basic rule: create a long password by joining some common words together (bonus: [Relevant XKCD](https://xkcd.com/936)).

Our password generator shall read an input file containing the list of words we are willing to choose from and then randomly select some to generate our new password. On UNIX machines, we can find a list of common words under `/usr/share/dict/words` or `/usr/dict/words`. If those are unavailable, a sample list of words is hosted [here](http://www.cs.duke.edu/~ola/ap/linuxwords). Another source of words in English is available [here](https://www-personal.umich.edu/~jlawler/wordlist).

Password generation must be parametrized according to the following criteria:
- Number of chosen words
- Minimum and maximum length for a given chosen word
- Minimum and maximum length for the overall password

In [78]:
import itertools
import random

WORDS=3
WMIN,WMAX=1,5
OVERALL_MIN,OVERALL_MAX=5,10

passwords=dict()


def readDictinaryPassword(filename:str):
    with open(filename,"r") as f:
        for line in f.readlines():
            words=line.strip()
            if len(words) not in passwords:
                passwords[len(words)]=[words]
            else:
                passwords[len(words)].append(words)


def generatePassword():
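    # The bounds cannot be met when even the shortest possible password is
    # too long, or the longest possible one is too short.
    if WMIN*WORDS>OVERALL_MAX or WMAX*WORDS<OVERALL_MIN:
        return "NO PASSWORD WITH THAT CONSTRAINT"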
    l=[passwords.get(i,[]) for i in range(WMIN,WMAX+1)]
    flat_list=list(itertools.chain(*l))
    
    e=random.sample(flat_list,WORDS)
    while not (len(''.join(e))>=OVERALL_MIN and len(''.join(e))<= OVERALL_MAX):
        e=random.sample(flat_list,WORDS)
    
    return ''.join(e)
    

readDictinaryPassword("listpaswd.txt")
your_password=generatePassword()
print(your_password)



utalebenlf
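
The criteria listed in the exercise can also be passed as function arguments rather than module-level constants. The following is a minimal sketch under that assumption; `generate_password`, its parameter names, `max_attempts` and the sample word list are all hypothetical and not part of the original solution.

```python
import random


def generate_password(words, count=3, word_min=1, word_max=5,
                      total_min=5, total_max=10, max_attempts=10_000):
    """Join `count` distinct random words so that all length bounds hold.

    Returns None when the bounds cannot be met, or were not met within
    `max_attempts` random draws.
    """
    # Keep unique words whose length is within the per-word bounds.
    candidates = sorted(w for w in set(words) if word_min <= len(w) <= word_max)
    if len(candidates) < count:
        return None
    # Shortest and longest totals reachable with the per-word bounds.
    if count * word_min > total_max or count * word_max < total_min:
        return None
    for _ in range(max_attempts):
        password = ''.join(random.sample(candidates, count))
        if total_min <= len(password) <= total_max:
            return password
    return None


# Hypothetical word list, used only for illustration.
sample_words = ["sun", "tree", "river", "sky", "stone", "leaf", "mountain"]
print(generate_password(sample_words))
```

Bounding the number of attempts avoids an endless loop when the word list happens to contain no valid combination. For passwords meant for real use, the standard `secrets` module is a better source of randomness than `random`.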


# Ex. 03.2 - Very strong passwords (133t)
It is quite common for a password to require numbers and special characters in addition to alphabetic characters. Include an additional function at the end of the password generation phase of Ex. 03.1 to remap some of its characters to different values in order to make the password more robust: e.g., change all the `o` characters to `0` and all `a` to `@`. Note that you can use a substitution cipher similar to the one used in Ex. 01.

In [80]:
cypher={
    'o':'0',
    'a':'@',
    'l':'1',
    'g':'9'
}

def makeStrong(password:str):
    return ''.join([ cypher.get(w,w) for w in password]) 

strong_password=makeStrong(your_password)
print(strong_password)

ut@1eben1f
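
Because every substitution here maps one character to one character, the built-in `str.maketrans` and `str.translate` offer an equivalent formulation. A minimal sketch using the same four substitutions; the names `LEET_TABLE` and `make_strong` are illustrative.

```python
# Translation table built from the same four substitutions used above.
LEET_TABLE = str.maketrans({'o': '0', 'a': '@', 'l': '1', 'g': '9'})


def make_strong(password: str) -> str:
    """Apply the substitutions; characters not in the table are kept."""
    return password.translate(LEET_TABLE)


print(make_strong("utalebenlf"))  # ut@1eben1f
```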


# Ex. 03.3 - Very strong passwords (custom vocabulary)
Rather than acquiring a list of preset words, let the program parse one or more text files in order to collect words. Make sure that at the end of the collection phase, all the retrieved words are unique in the collection.
For instance, Shakespeare's sonnets, which can be found in the [Internet Archive](https://archive.org/details/shakespearessonn01041gut), can be used as input. Any other public domain corpus would be suitable as well (see [Project Gutenberg](https://archive.org/details/gutenberg)).

In [6]:
import re,random
import itertools

WORDS=3
WMIN,WMAX=1,5
OVERALL_MIN,OVERALL_MAX=5,10

passwords=dict()

file1="text1.txt"
file2="text2.txt"
outputfile="output.txt"



def veryStrongPassword(*files_name):
    pattern = r'\W+'
    words=set()

    for file in files_name:
        with open(file,"r") as f:
            # Split each lower-cased row into words, discarding empty strings
            l=[ filter(None, re.split(pattern, row.lower().strip())) for row in f.readlines() ]
            merged = set(itertools.chain(*l))
        words.update(merged)
    # Write one unique word per line; the with-block closes and flushes the file
    with open(outputfile,"w") as out:
        out.write('\n'.join(words))


def readDictinaryPassword(filename:str):
    with open(filename,"r") as f:
        for line in f.readlines():
            words=line.strip()
            if len(words) not in passwords:
                passwords[len(words)]=[words]
            else:
                passwords[len(words)].append(words)


def generatePassword():
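    # The bounds cannot be met when even the shortest possible password is
    # too long, or the longest possible one is too short.
    if WMIN*WORDS>OVERALL_MAX or WMAX*WORDS<OVERALL_MIN:
        return "NO PASSWORD WITH THAT CONSTRAINT"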
    l=[passwords.get(i,[]) for i in range(WMIN,WMAX+1)]
    flat_list=list(itertools.chain(*l))
    
    e=random.sample(flat_list,WORDS)
    while not (len(''.join(e))>=OVERALL_MIN and len(''.join(e))<= OVERALL_MAX):
        e=random.sample(flat_list,WORDS)
    
    return ''.join(e)
    




veryStrongPassword(file1,file2)

readDictinaryPassword("output.txt")
your_password=generatePassword()
print("Password "+your_password)

cypher={
    'o':'0',
    'a':'@',
    'l':'1',
    'g':'9'
}

def makeStrong(password:str):
    return ''.join([ cypher.get(w,w) for w in password]) 

strong_password=makeStrong(your_password)
print("Strong password: "+strong_password)


Password longset5
Strong password: 10n9set5


# Ex. 04 - Outliers
When analysing data collected as part of a science experiment it may be desirable to remove outliers before performing other calculations. 
Write a function that takes a sequence of values and a non-negative integer `n` as parameters. The function shall create and return a new copy of the list with the `n` largest elements and the `n` smallest elements removed.

## Hints
- The order of the items in the returned list does not have to match the original one
- It would be beneficial to handle appropriate corner cases, such as a value of `n` incompatible with the sequence length

In [39]:
import random
N=5
test_values=[ random.randint(0,1000) for _ in range(100) ]

def outliers(values:list, n:int):
    if n<0:
        raise ValueError("n must be non-negative")
    s=sorted(values)
    # Removing 2*n items from a list of at most 2*n items leaves nothing
    if 2*n>=len(s):
        return []
    # Drop the n smallest and the n largest values; duplicates count one by one
    return s[n:len(s)-n]

out=outliers(test_values,N)
print(out)

[74, 78, 93, 102, 104, 105, 114, 116, 117, 128, 147, 147, 155, 156, 167, 191, 207, 238, 239, 243, 249, 267, 267, 267, 270, 274, 288, 315, 328, 334, 353, 357, 358, 360, 391, 404, 409, 412, 415, 418, 418, 419, 470, 488, 501, 502, 503, 529, 553, 557, 560, 600, 606, 630, 640, 642, 643, 648, 681, 702, 706, 711, 724, 738, 743, 748, 752, 755, 757, 759, 760, 784, 789, 796, 797, 808, 813, 820, 829, 848, 859, 870, 872, 901, 915, 925, 928, 930, 940, 952]


# Ex. 05 - Linear best-fit
A line of best-fit is a straight line that best approximates a collection of `n` data points. We are interested in points in the two-dimensional plane.
The target function is represented by the equation $y = mx + b$, where $m$ and $b$ are calculated using the following formulas:
$$
b = \bar{y} - m \bar{x}
$$
$$
m = \frac{\sum xy-\frac{\sum x \sum y}{n}}{\sum x^2-\frac{(\sum x)^2}{n}}
$$
Write a program that reads a collection of points and iteratively evaluates $m$ and $b$ as new points are discovered.
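
Both formulas depend only on $n$, $\sum x$, $\sum y$, $\sum xy$ and $\sum x^2$, so it is enough to keep these running totals and update them for each new point. The following is a minimal sketch of that idea; the class name `RunningFit` and its methods are hypothetical, and the points are hard-coded rather than read from a file.

```python
class RunningFit:
    """Keeps the sums needed by the formulas and updates them per point."""

    def __init__(self):
        self.n = 0
        self.sum_x = self.sum_y = self.sum_xy = self.sum_xx = 0.0

    def add(self, x: float, y: float):
        """Record one point and return (m, b), or None if not yet defined."""
        self.n += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_xy += x * y
        self.sum_xx += x * x
        return self.coefficients()

    def coefficients(self):
        if self.n < 2:
            return None  # a single point does not determine a line
        denominator = self.sum_xx - self.sum_x * self.sum_x / self.n
        if denominator == 0:
            return None  # all x values coincide: vertical line, m undefined
        m = (self.sum_xy - self.sum_x * self.sum_y / self.n) / denominator
        b = self.sum_y / self.n - m * self.sum_x / self.n
        return m, b


fit = RunningFit()
for point in [(0, 1), (1, 3), (2, 5), (3, 7)]:
    print(point, fit.add(*point))
```

The sample points lie on $y = 2x + 1$, so the estimates settle at $m = 2$ and $b = 1$ from the second point onwards. With floating-point inputs the exact comparison against zero may miss near-vertical data; a tolerance could be used instead.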

# Ex. 06 - n-dice simulation
Write a program to simulate empirically the expected results of rolling `n` dice a parametrized number of times `r`.
The program should iteratively count and update the number of times that each total occurs.
Then it should display a table that summarizes the collected data, showing for each total both the empirical frequency and the percentage expected by probability theory.

## Example
Considering the values `n=2` and `r=1000` we could obtain something in line with:

| Sum | %Simulated | %Expected |
|-----|------------|-----------|
| 2   | 2.61       | 2.78      |
| 3   | 6.89       | 5.56      |
| 4   | 7.21       | 8.33      |
| ... | ...        | ...       |
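
One possible way to produce such a table is sketched below, assuming fair six-sided dice (which agrees with the expected percentages in the example). The function names `simulate`, `expected` and `report` are hypothetical. The expected percentages are obtained by enumerating all $6^n$ outcomes, which is only practical for small `n`.

```python
import random
from collections import Counter
from itertools import product


def simulate(n: int, r: int) -> Counter:
    """Roll `n` six-sided dice `r` times and count how often each total occurs."""
    counts = Counter()
    for _ in range(r):
        counts[sum(random.randint(1, 6) for _ in range(n))] += 1
    return counts


def expected(n: int) -> dict:
    """Exact probability of each total, by enumerating all 6**n outcomes."""
    totals = Counter(sum(faces) for faces in product(range(1, 7), repeat=n))
    return {total: ways / 6 ** n for total, ways in totals.items()}


def report(n: int, r: int) -> None:
    """Print a table comparing simulated and expected percentages."""
    counts, theory = simulate(n, r), expected(n)
    print("| Sum | %Simulated | %Expected |")
    print("|-----|------------|-----------|")
    for total in range(n, 6 * n + 1):
        simulated = 100 * counts[total] / r
        print(f"| {total:<3} | {simulated:<10.2f} | {100 * theory[total]:<9.2f} |")


report(n=2, r=1000)
```

The simulated column changes from run to run, while the expected column is fixed: for two dice it starts at 2.78, 5.56 and 8.33, as in the example table.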