
# Practice 1: Solving problems by search. 

## Uninformed and informed search strategies.

<center><h3>
    Name and Surname
</h3></center>


## Instructions

This is a `Jupyter Notebook`, a document that integrates Python code into a Markdown file.
This allows us to execute code cells step by step, as well as automatically generate a well-formatted report of the practice.

You can add a new cell using the "Insert" button in the toolbar and change its type with "Cell > Cell Type".

To execute a code cell, select it and press the "▶ Run" button in the toolbar.
To convert the document to HTML, select "File > Download as > HTML (.html)".

Follow this script to the end. Execute the provided code step by step, understanding what you are doing and reflecting on the results. There will be questions interspersed throughout the script; answer all of them in the designated section: "Responses to the questionnaires". Please do not modify any line of code unless explicitly instructed to do so.

Don't forget to insert your name and surname in the top cell.

## Submission of the practice

The submission deadline will be the one indicated in the Virtual Campus. The submission will consist of a single compressed file named `LASTNAME_FIRSTNAME_P1.zip`, containing the following files:

* `LASTNAME_FIRSTNAME_P1.html`: An HTML file generated from exporting this Notebook, with the answered questions at the end of the document.
* `LASTNAME_FIRSTNAME_P1.ipynb`: The source Jupyter Notebook file.
* The data file(s) used for problem-solving.


## Python preliminaries


Here are some Python functions that may be helpful while developing this practice.


For example, you can generate random numbers using the package `random`.

In [None]:
import random

# we can create a random number between 1 and 10
random_number = random.randint(1, 10)
print(random_number)

# and random numbers between 0 and 1 following a uniform distribution
random_U = random.uniform(0, 1)
print(random_U)


You can also build lists of fixed or random numbers and shuffle them in place, as illustrated below.

In [None]:
vector = [x for x in range(1, 10)]
print("fixed vector", vector)

random.shuffle(vector)
print(vector)

random_vector = [random.randint(1, 10) for i in range(1, 10)]
print("random vector", random_vector)

random.shuffle(random_vector)
print(random_vector)

Another important set of functions comes from the math module. You can find a list of available functions at https://docs.python.org/3/library/math.html. Below are some usage examples.

In [None]:
import math 

# number e raised to the specified power
e = math.exp(1)
print(e)

power2_e = math.exp(2)
print(power2_e)

# example of exponentiation
print(math.pow(e, 1))
print(math.pow(e, 2))

# example of the natural logarithm with base e
base = e
print(math.log(e))
print(math.log(e, base))

Finally, functions from the time module allow you to approximately measure the execution time of specific sections of code.

In [None]:
import time
start_time = time.time()

# use a name other than `sum` to avoid shadowing the built-in function
total = 0
for i in range(1000000):
    total = total + 1

print("---- %s seconds ----" % (time.time() - start_time))

## The Traveling Salesperson Problem (TSP)

The objective of this practice is to model and implement an intelligent agent capable of solving the Traveling Salesperson Problem. To achieve this, you will implement the basic algorithm covered in the lecture and evaluate whether introducing modifications to the algorithm's design improves the quality of the solutions obtained.


### Problem definition


The Traveling Salesperson Problem (TSP) involves a salesperson who wants to sell a product and, to do so, needs to find the shortest possible route through the cities of their customers, visiting each city only once and starting and ending the journey in their own city (a circular route from the initial city).

Typically, the problem is represented using a weighted graph G = (N, A), where N is the set of n = |N| nodes (cities), and A is the set of arcs connecting the nodes. Each arc (i, j) ∈ A is assigned a weight d_ij, which represents the distance between cities i and j.


To facilitate your implementation work, we provide the `Localizaciones` class, which allows you to load the GPS locations representing the vertices of the graph G of n cities. It also enables you to transparently calculate the distance between any pair of cities using the haversine formula (https://en.wikipedia.org/wiki/Haversine_formula), which accounts for the Earth's curvature when computing distances.
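For reference, the haversine formula itself can be sketched in a few lines. This is only an illustration, not the code inside the provided class: the function name `haversine_km` and the mean Earth radius of 6371 km are assumptions made here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the helper module


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A quarter of the equator: from (0, 0) to (0, 90) degrees
print(haversine_km(0.0, 0.0, 0.0, 90.0))
```

The quarter-equator call should print roughly a quarter of the Earth's circumference under the assumed radius, which is a quick sanity check for the formula.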

First, import the Python module that accompanies this practice, which includes some implemented support functions as well as the data loading class.

In [None]:
from helpers_mod_sa import *

Inspect the location loading code using `psource(Localizaciones)`.

In [None]:
psource(Localizaciones)

Note that by default, the file `./data/grafo8cidades.txt` is loaded, which contains the GPS coordinates of 8 Galician cities, with Santiago de Compostela being the first one. The first line of these files indicates the number of cities n, and each of the following lines gives the GPS coordinates of one city (latitude and longitude in degrees).
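To make the file layout concrete, here is a minimal parsing sketch. It assumes the two values on each line are separated by whitespace, with latitude first; the real separator is not stated here, so check it with `psource(Localizaciones)`. The function `parse_locations` and the sample values are hypothetical and are not the contents of the real data file.

```python
def parse_locations(text):
    """Parse n on the first line, then one 'latitude longitude' pair per line."""
    lines = [line for line in text.splitlines() if line.strip()]
    n = int(lines[0])
    coords = [tuple(float(v) for v in line.split()[:2]) for line in lines[1:n + 1]]
    if len(coords) != n:
        raise ValueError(f"expected {n} cities, found {len(coords)}")
    return coords


# Made-up sample in the assumed format (illustrative values only)
sample = """3
40.0 -8.0
41.5 -7.5
39.0 -9.0
"""
cities = parse_locations(sample)
print(len(cities), cities[0])
```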

You can load a different file by using the `filename` parameter, as shown below. If everything works correctly, the distance between city 0 and city 1 should be approximately 55 km.

In [None]:
g1 = Localizaciones(filename='./data/grafo8cidades.txt')
print(g1.distancia(0, 1))

The TSP can be formulated as the problem of finding the shortest Hamiltonian circuit in the graph G. A solution to a TSP instance can be represented as a permutation of city indices, where the order of visits determines the total travel cost in terms of distance.
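As an illustration of how the visiting order determines the cost, the sketch below evaluates two permutations on a toy distance matrix. The helper `tour_length`, the matrix `D`, and its values are made up for this example. In the notebook, a method such as `g1.distancia` could presumably be passed as `dist` instead.

```python
def tour_length(tour, dist):
    """Total length of a closed tour; `tour` lists each city exactly once."""
    total = 0.0
    for i in range(len(tour)):
        # The modulo wraps the last city back to the first one
        total += dist(tour[i], tour[(i + 1) % len(tour)])
    return total


# Toy symmetric distance matrix for 4 cities (made-up values)
D = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]


def toy_dist(i, j):
    return D[i][j]


print(tour_length([0, 1, 2, 3], toy_dist))  # 2 + 6 + 3 + 10
print(tour_length([0, 1, 3, 2], toy_dist))  # 2 + 4 + 3 + 9
```

The same set of cities yields different totals depending only on the order in which they are visited.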

The number of candidate tours grows factorially with n: once the starting city is fixed, there are (n-1)! possible orderings of the remaining cities. The TSP is NP-hard, so solving it exactly becomes computationally prohibitive as n increases. In this practice, you will first explore uninformed search strategies to tackle the problem and evaluate their feasibility. Then, you will implement an informed approach, such as greedy search, to compare its efficiency and effectiveness in finding good solutions.
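To get a feel for this growth, the short sketch below counts the orderings of the remaining cities when the starting city is fixed, which gives (n-1)! candidates. The helper name `count_tours` is hypothetical.

```python
import math


def count_tours(n):
    """Orderings of the other n-1 cities once the starting city is fixed."""
    return math.factorial(n - 1)


for n in (4, 8, 12, 16, 20):
    print(n, count_tours(n))
```

Even at 8 cities there are already a few thousand candidate tours, and the count grows far faster than any polynomial in n.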

Later in the course, we will explore more advanced techniques, such as metaheuristic approaches, which allow solving larger instances of the problem more efficiently. These methods, such as simulated annealing or genetic algorithms, can provide high-quality solutions in a reasonable amount of time, making them suitable for real-world applications.


## P1.1: Implement the Basic Breadth-First Search (BFS) Algorithm for the TSP


Implement the basic Breadth-First Search (BFS) algorithm to solve the Traveling Salesperson Problem (TSP) as stated above. To do so, review the algorithmic description of uninformed search techniques covered in the lecture (see T1, slide 37 and related slides).

Take into account the following design considerations to complete the basic implementation:

* Solution Representation: The solution should be represented as an ordered sequence (permutation) of cities, starting and ending at city 0.

* Initial State: The search begins with a path that contains only the starting city (0).

* Successor Function: The next states are generated by expanding the current path to all unvisited cities.

* Cost Function: The cost of a solution is the total sum of distances along the path.

* Search Mechanism: BFS explores all possible sequences of cities level by level, so paths with fewer cities are expanded before paths with more cities.

* Stopping Condition: The algorithm terminates when a complete tour (covering all cities and returning to the start) is found, or when all possibilities are exhausted.
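The design considerations above can be sketched on a toy distance matrix as follows. This is one possible reading, not the reference solution: the function `bfs_tsp`, the matrix `D`, and its values are made up for illustration. The sketch takes the exhaustive option of the stopping condition, because all complete tours sit at the same depth of the search tree, so the first one reached is not necessarily the shortest.

```python
from collections import deque


def bfs_tsp(n, dist, start=0):
    """Expand partial paths level by level and keep the shortest complete tour."""
    best_tour, best_cost = None, float("inf")
    frontier = deque([([start], 0.0)])  # FIFO queue of (path, cost so far)
    while frontier:
        path, cost = frontier.popleft()
        if len(path) == n:
            # All cities visited: close the tour by returning to the start
            total = cost + dist(path[-1], start)
            if total < best_cost:
                best_tour, best_cost = path + [start], total
            continue
        # Successor function: extend the path with every unvisited city
        for city in range(n):
            if city not in path:
                frontier.append((path + [city], cost + dist(path[-1], city)))
    return best_tour, best_cost


# Toy symmetric distance matrix for 4 cities (made-up values)
D = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

tour, cost = bfs_tsp(4, lambda i, j: D[i][j])
print(tour, cost)
```

Note that the queue holds every partial path of the current level at once, which is worth keeping in mind when reflecting on memory usage.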

To verify your implementation, you can use the location file containing 8 Galician cities (`grafo8cidades.txt`). The optimal solution, obtained using an informed search such as A*, is approximately 382 km.


In [None]:
# Write your code here for the function that implements the Breadth-First Search (BFS) algorithm
# Create as many cells as you find necessary to write your code
# Always document your code with comments like this



❓ **Question 1**. How does BFS guarantee finding the optimal solution for small instances?

❓ **Question 2**. What is the main limitation of BFS when applied to large instances of the TSP?

Note: Be conservative in your strategy for verifying your implementation, especially when working with large data files like the USA cities problem. If you run your algorithm for a large number of iterations, it may be useful to measure the execution time to make informed decisions about where to set the limit.

## P1.2: Implement the Basic Greedy Search Algorithm for the TSP

Now, you will implement the Greedy Search algorithm to solve the Traveling Salesperson Problem (TSP) as stated above. This algorithm follows a heuristic approach to construct a solution step by step by always selecting the nearest unvisited city from the current position.

Design Considerations for the Implementation:

* Solution Representation: The solution should be represented as an ordered sequence (permutation) of cities, starting and ending at city 0.

* Initial State: The search starts from city 0.

* Successor Function (Greedy Choice): At each step, the next city is chosen as the closest unvisited city based on the current position.

* Cost Function: The cost of a solution is the total sum of distances along the path.

* Search Mechanism: The algorithm constructs a single path by iteratively adding the nearest unvisited city until all cities are visited, then returning to the starting city.

* Stopping Condition: The algorithm terminates when all cities have been visited and the path is completed by returning to the starting city.
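The nearest-neighbour construction described in these points can be sketched on a toy distance matrix as shown below. It is a minimal illustration under assumed names, not the reference solution: `greedy_tsp`, the matrix `D`, and its values are made up.

```python
def greedy_tsp(n, dist, start=0):
    """Build one tour by always moving to the nearest unvisited city."""
    tour, cost = [start], 0.0
    unvisited = set(range(n)) - {start}
    while unvisited:
        current = tour[-1]
        # Greedy choice: closest unvisited city (sorted for deterministic ties)
        nearest = min(sorted(unvisited), key=lambda c: dist(current, c))
        cost += dist(current, nearest)
        tour.append(nearest)
        unvisited.remove(nearest)
    # Close the tour by returning to the starting city
    cost += dist(tour[-1], start)
    tour.append(start)
    return tour, cost


# Toy symmetric distance matrix for 4 cities (made-up values)
D = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

tour, cost = greedy_tsp(4, lambda i, j: D[i][j])
print(tour, cost)
```

Unlike an exhaustive search, this sketch keeps only a single path in memory and evaluates each remaining city once per step.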

Again, to verify your implementation, you can use the location file containing 8 Galician cities (`grafo8cidades.txt`). The optimal solution, obtained using an informed search such as A*, is approximately 382 km.

In [None]:
# Write your code here for the function that implements the Greedy search algorithm
# Create as many cells as you find necessary to write your code
# Always document your code with comments like this



❓ **Question 1**. Why is the greedy approach considered a heuristic method?

❓ **Question 2**. Can the greedy approach guarantee finding the optimal solution? Why or why not?

❓ **Question 3**. Describe a scenario where the greedy algorithm may lead to a suboptimal path.

---
### General questions

Compare the time complexity of BFS and Greedy Search in solving the TSP.

❓ **Question 1**. Which algorithm scales better for large instances?

❓ **Question 2**. In terms of memory usage, which algorithm is more efficient? Why?

❓ **Question 3**. Compare the solutions found by both algorithms for the 8-city instance (grafo8cidades.txt):

❓ **Question 4**. What is the total distance of the tour found by BFS?

❓ **Question 5**. What is the total distance of the tour found by the Greedy algorithm?

❓ **Question 6**. How do these distances compare to the optimal solution (~382 km with A*)?