# AI - CA2 - Genetics - Mohamad Taha Fakharian

## Goal
In this assignment, we're going to formulate a search problem and try to solve it using uninformed (BFS and IDS) and informed (A* and Weighted A*) algorithms.

The problem is described as follows: Gandalf wants to deliver fellows from some places to their destinations and, after that, go to Gondor. The problem isn't that easy: there are some sleeping orcs on the map and Gandalf shouldn't wake them!

## Overall approach
So let's solve the problem! First, let's import some libraries and define some helper functions to formulate the problem, then solve it using the methods mentioned above:

In [1]:
# import some libraries
import numpy as np
import pandas as pd
from re import split

import string
import random
from math import floor

In [2]:
# Default key length (number of letters in a chromosome)
CONSTANT = 14

In [3]:
class Decoder:
    
    def __init__(self, global_text, encoded_text, key_length = CONSTANT):
        self.words_dict = self.make_dict(global_text)
        self.raw_text = encoded_text
        self.key_length = key_length
        self.encoded_words = self.make_words(encoded_text)
        
        self.key = None
        self.mutation_prob = 0.09
        self.pop_size = 100
        self.total_generations = 10000
        self.chromosomes = self.generate_initial_population()
        self.elites_perc = 30
        
        self.cur_index = 0
        
    def make_dict(self, global_text):
        dictionary = {}
        for word in split('[^a-zA-Z]+', global_text):
            if not word:
                continue
            word = word.lower()
            if word[0] not in dictionary:
                dictionary[word[0]] = set()
            dictionary[word[0]].add(word)
        return dictionary
    
    def make_words(self, encoded_text):
        return [word.lower() for word in split('[^a-zA-Z]+', encoded_text) if word]
    
    
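    def convert(self, encoded, chromosome):
        # Vigenere decryption of one lowercase letter: subtract the current key
        # letter modulo 26, then advance the key position cyclically.
        alphabet = (ord(encoded) - ord(chromosome[self.cur_index])) % 26 + 97
        converted = chr(alphabet)
        self.cur_index = (self.cur_index + 1) % self.key_length
        return converted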
    
    def calc_fitness(self, chromosome):
        fitness = 0
        self.cur_index = 0
        for word in self.encoded_words:
            decoded = ''.join(self.convert(encoded, chromosome) for encoded in word)
            if (decoded[0] in self.words_dict) and (decoded in self.words_dict[decoded[0]]):
                fitness += 1
        return fitness
    
    def crossover(self, first_chromosome, second_chromosome):
        point = floor(self.key_length / 2)
        return first_chromosome[:point] + second_chromosome[point:], second_chromosome[:point] + first_chromosome[point:]
    
    def mutate(self, chromosome):
        new_chromosome = chromosome
        if (random.random() < self.mutation_prob):
            gene = random.randint(0, self.key_length - 1)
            candidate = chromosome[:gene] + random.choice(string.ascii_lowercase) + chromosome[gene+1:]
            if self.calc_fitness(candidate) > self.calc_fitness(chromosome):
                new_chromosome = candidate
        return new_chromosome 
    
    def generate_initial_population(self):
        return [''.join(random.choice(string.ascii_lowercase) 
                for i in range(self.key_length)) 
                for j in range(self.pop_size)]
    
    def generate_new_population(self):
        sorted_chromosomes = [(self.calc_fitness(chromosome), chromosome) for chromosome in self.chromosomes]
        sorted_chromosomes.sort(reverse=True)
        print("Best: key = {}, fitness = {}".format(sorted_chromosomes[0][1], sorted_chromosomes[0][0]))
        
        if sorted_chromosomes[0][0] == len(self.encoded_words):
            self.key = sorted_chromosomes[0][1]
            return self.chromosomes
        
        elites_num = floor((self.elites_perc / 100) * (self.pop_size))
        elites = sorted_chromosomes[:elites_num]
        new_pop = [self.mutate(chromosome[1]) for chromosome in elites]
        crossovering = [chromosome[1] for chromosome in sorted_chromosomes[:(self.pop_size - elites_num)]]
        random.shuffle(crossovering)
        
        # Stop when the population is full or no complete pair of parents remains
        while len(new_pop) < self.pop_size and len(crossovering) >= 2:
            first_parent = crossovering.pop(0)
            second_parent = crossovering.pop(0)
            
            first_child, second_child = self.crossover(first_parent, second_parent)
            first_child = self.mutate(first_child)
            second_child = self.mutate(second_child)
            
            new_pop.append(first_child)
            new_pop.append(second_child)
        return new_pop
    
    def decode(self):
        for i in range(self.total_generations):
            self.chromosomes = self.generate_new_population()
            # Check right after each generation so a key found in the last one is not lost
            if self.key:
                return self.show_text()
        return "Couldn't decode!"
    
    def show_text(self):
        decoded_text = '\n\nDecoded text:\n\n'
        self.cur_index = 0
        for character in self.raw_text:
            if 'A' <= character <= 'Z':
                decoded_text += self.convert(character.lower(), self.key).upper()
            elif 'a' <= character <= 'z':
                decoded_text += self.convert(character, self.key)
            else:
                decoded_text += character
        return decoded_text
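
The two core operations of the class above can be tried in isolation. The following is a minimal sketch, not part of the original notebook: `vigenere_decrypt` and `midpoint_crossover` are hypothetical standalone helpers that mirror what `convert`/`show_text` and `crossover` appear to do (subtract a repeating key letter modulo 26, advancing the key only on letters, and swap the halves of two keys at the midpoint).

```python
from itertools import cycle
import string


def vigenere_decrypt(ciphertext, key):
    """Undo a repeating-key letter shift; the key advances only on letters."""
    key_stream = cycle(key.lower())
    decoded = []
    for ch in ciphertext:
        if ch in string.ascii_letters:
            shift = ord(next(key_stream)) - ord('a')
            base = ord('A') if ch.isupper() else ord('a')
            decoded.append(chr((ord(ch) - base - shift) % 26 + base))
        else:
            decoded.append(ch)  # spaces and punctuation pass through unchanged
    return ''.join(decoded)


def midpoint_crossover(first, second):
    """Single-point crossover at the middle of two equal-length keys."""
    point = len(first) // 2
    return first[:point] + second[point:], second[:point] + first[point:]


print(vigenere_decrypt("lxfopvefrnhr", "lemon"))  # attackatdawn
print(midpoint_crossover("aaaaaa", "bbbbbb"))     # ('aaabbb', 'bbbaaa')
```

Because the crossover point is fixed at the midpoint, each child inherits exactly one half of the key from each parent, so any letter-level variation has to come from mutation.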

In [4]:
import time
# Read both input files, closing them as soon as they are loaded
with open('encoded_text.txt') as f:
    encoded_text = f.read()
with open('global_text.txt') as f:
    global_text = f.read()

decoder = Decoder(global_text, encoded_text)
t0 = time.time()
print(decoder.decode())
t1 = time.time()

print("Time spent = {}".format(t1 - t0))

Best: key = hckczaeifyccdu, fitness = 35
Best: key = jjbeciazurjsqc, fitness = 37
Best: key = azjecpbnprtecm, fitness = 40
Best: key = azjecpbnprtecm, fitness = 40
Best: key = jjbecianprtecm, fitness = 42
Best: key = jjbecianprtecm, fitness = 42
Best: key = iqpfwtfnprtecm, fitness = 45
Best: key = iqpfwtfnprtecm, fitness = 45
Best: key = jjbesianprtecm, fitness = 45
Best: key = jjbesianprtecm, fitness = 45
Best: key = jjbesianprtecm, fitness = 45
Best: key = iqpfwtfnprtech, fitness = 46
Best: key = jrbesianprtecm, fitness = 48
Best: key = jrbesianprtecm, fitness = 48
Best: key = jrbesianprtecm, fitness = 48
Best: key = jrbesianprtecm, fitness = 48
Best: key = jrbesianprtecm, fitness = 48
Best: key = jjbessgnprtecm, fitness = 50
Best: key = jrbeseanprtecm, fitness = 50
Best: key = jrbeseanprtecm, fitness = 50
Best: key = zrbeseanprtecm, fitness = 52
Best: key = zrbeseanprtecm, fitness = 52
Best: key = jlbesianprtecm, fitness = 53
Best: key = jlbesianprtecm, fitness = 53
Best: key = jlbe

## Conclusion
Picking a good search algorithm depends on our needs. If memory is not a concern, BFS is a good choice; but if we can define a good heuristic function for our problem, $A^*$ is a good choice, and this approach can be made faster using Weighted $A^*$.