### Project Description

### Key Features:
- **Ant Colony Optimization:** Implements ACO to optimize parameters for the GOR prediction model, providing a novel approach to solving this complex problem.
- **Statistical Analysis:** Includes detailed statistical analysis comparing the new model's performance with established correlations like Standing’s, Glaso’s, and Petrosky’s.
- **Streamlit Deployment:** Deploys the prediction model as an interactive web application using Streamlit, enabling users to input PVT data and receive GOR predictions instantly.

### Detailed Explanation:
The Ant Colony Optimization Algorithm inspired our approach to optimizing the gas-oil ratio (GOR) prediction model. Here's how the principles of ant colony behavior relate to our problem:

- **Decentralized Intelligence:** Just as ants use decentralized intelligence to find food, our algorithm uses multiple agents (ants) to explore the solution space for the best-fit parameters.
- **Pheromone Trails:** Ants communicate via pheromones, leaving trails that signal the path to food. Similarly, our algorithm leaves "pheromone" signals on promising paths (parameter combinations) that lead to accurate GOR predictions.
- **Exploration and Exploitation:** Ants balance following existing pheromone trails and exploring new paths. Our algorithm also balances between exploiting known good parameters and exploring new combinations to avoid local minima.
- **Convergence:** Over time, ants converge on the shortest path to food. Our algorithm defines the best parameter combinations as the "shortest path" to accurate GOR predictions.

### Results:
- **Superior Accuracy:** The developed correlation outperforms traditional methods, showing lower average relative error and higher correlation coefficients.
- **Enhanced Reliability:** Engineers can rely on this newly developed correlation after validation with field data, ensuring better accuracy in the region where the correlation was developed.

### Conclusion:
The project demonstrates the potential of ACO in optimizing GOR predictions and highlights the benefits of the developed correlation over traditional methods. Future work includes further refinement of the ACO algorithm for faster convergence and broader applicability.

### References:
- Ant Colony Optimization Algorithms: https://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms

- Gas-Oil Ratio: https://www.sciencedirect.com/topics/engineering/gas-oil-ratio


### Importing Dependencies

In [4]:
import sympy as smp
from sympy import *
import numpy as np
from sklearn.linear_model import LinearRegression

### Optimization Algorithm Class

#### Class Definition and Initialization

In [2]:
class AntColonyOptimization:
    def __init__(self, pvt_data, num_ants=10, num_iterations=100, decay=0.95, alpha=1, beta=2):
        self.pvt_data = pvt_data
        self.num_ants = num_ants
        self.num_iterations = num_iterations
        self.decay = decay
        self.alpha = alpha
        self.beta = beta
        self.distance_matrix = self.calculate_distance_matrix()
        self.pheromone_matrix = np.ones_like(self.distance_matrix) / len(pvt_data)
        self.shortest_path = None
        self.shortest_cost = np.inf



The `__init__` method initializes the algorithm with the given parameters and input data:

- **pvt_data**: The input PVT records. Each record must provide a `bubble_point_pressure` value, which `calculate_distance_matrix` reads.
- **num_ants**: Number of ants simulated in the colony. Each ant builds one path, so more ants explore more paths.
- **num_iterations**: Number of iterations to run the algorithm, i.e. how many times the ants search for optimal paths.
- **decay**: Fraction of pheromone retained after each update (`0.95` keeps 95% and evaporates 5%). Lower values make the colony forget old paths faster.
- **alpha**: Exponent controlling the influence of pheromone levels on ant path selection.
- **beta**: Exponent controlling the influence of the distance between points on ant path selection.
- **distance_matrix**: Matrix of distances between points in the input data, calculated by the `calculate_distance_matrix` method.
- **pheromone_matrix**: Matrix of pheromone levels on each path, initialized uniformly to `1 / len(pvt_data)`.
- **shortest_path** and **shortest_cost**: Variables that store the best path found by the ants and its associated cost throughout the algorithm's execution.
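To make the expected inputs concrete, here is a minimal sketch of a `pvt_data` list and the uniform pheromone initialization. The pressure values are made up for illustration and are not field data; the only field the code in this notebook reads is `bubble_point_pressure`.

```python
import numpy as np

# Hypothetical PVT records; the pressure values are illustrative only.
pvt_data = [
    {"bubble_point_pressure": 1500.0},
    {"bubble_point_pressure": 2100.0},
    {"bubble_point_pressure": 1800.0},
]

num_points = len(pvt_data)

# Mirrors the initialization in __init__: every edge starts at 1 / num_points.
pheromone_matrix = np.ones((num_points, num_points)) / num_points

print(pheromone_matrix.shape)  # (3, 3)
print(pheromone_matrix[0, 1])  # one third
```

Starting every edge at the same level means the first iteration is driven by the distance heuristic alone, since no path has been reinforced yet.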


#### Distance Matrix Calculation

In [10]:
def calculate_distance_matrix(self):
    num_points = len(self.pvt_data)
    dist_matrix = np.zeros((num_points, num_points))
    for i in range(num_points):
        for j in range(num_points):
            if i != j:
                # Simple distance metric based on the absolute difference in 'bubble_point_pressure'
                dist_matrix[i, j] = np.abs(self.pvt_data[i]['bubble_point_pressure'] - self.pvt_data[j]['bubble_point_pressure'])
    return dist_matrix

The `calculate_distance_matrix` function in our algorithm takes in data about different points (or samples) characterized by their bubble point pressures. It then computes a matrix that quantifies the differences in bubble point pressures between each pair of points. This matrix is crucial because it defines the distances between points based on their pressure variations. In simpler terms, it helps the algorithm understand how similar or different each pair of points is in terms of their bubble point pressures.

This understanding is essential for tasks such as finding optimal paths or clustering similar data points together. For our specific algorithm, this distance matrix serves as a foundation for calculating probabilities that guide ants in finding paths with potentially better solutions, contributing to the optimization process.
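For larger datasets, the nested loops can be replaced by NumPy broadcasting. The sketch below is an alternative formulation, not the notebook's method; the function name `distance_matrix_vectorized` and the sample pressures are assumptions for illustration.

```python
import numpy as np

def distance_matrix_vectorized(pvt_data):
    """Pairwise absolute differences in bubble point pressure, without Python loops."""
    pressures = np.array(
        [record["bubble_point_pressure"] for record in pvt_data], dtype=float
    )
    # Broadcasting a column against a row yields every pairwise difference.
    return np.abs(pressures[:, None] - pressures[None, :])

# Hypothetical sample values, for illustration only.
sample = [{"bubble_point_pressure": p} for p in (1500.0, 2100.0, 1800.0)]
dist = distance_matrix_vectorized(sample)
print(dist)
# [[  0. 600. 300.]
#  [600.   0. 300.]
#  [300. 300.   0.]]
```

The result is symmetric with a zero diagonal, which matches what the loop-based version produces.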

#### Generating Ant Paths

In [13]:
def generate_ant_paths(self):
    num_points = len(self.pvt_data)
    ants_paths = []
    for _ in range(self.num_ants):
        start = np.random.randint(num_points)  # Randomly choose a starting point
        path = [start]
        visited = {start}
        while len(visited) < num_points:
            probs = self.calculate_probabilities(path[-1], visited)  # Calculate probabilities of next steps
            next_point = np.random.choice(num_points, p=probs)  # Choose the next point based on probabilities
            path.append(next_point)
            visited.add(next_point)
        ants_paths.append(path)
    return ants_paths

The `generate_ant_paths` function builds one path for each simulated ant. It first determines the total number of points (samples) in the dataset. Each of the `self.num_ants` ants then starts from a point chosen at random. As long as the ant has not yet visited every point, it computes the probability of each possible next move from its current position and the set of points it has already visited.

The ant samples its next point from these probabilities, which favor moves with stronger pheromone trails and shorter distances. This repeats until the ant has visited every point, and the whole procedure repeats until every ant has built a full path. This function models how ants explore and navigate through candidate solutions, contributing to the overall optimization process.
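One property worth checking is that each path visits every point exactly once. The standalone sketch below reproduces the path-building loop outside the class. `build_tour` and `uniform_next_probs` are hypothetical names, and the uniform rule is only a stand-in for `calculate_probabilities`.

```python
import numpy as np

def build_tour(num_points, next_probs, rng):
    """Build one path that visits every point exactly once."""
    start = int(rng.integers(num_points))  # random starting point
    path = [start]
    visited = {start}
    while len(visited) < num_points:
        probs = next_probs(path[-1], visited)  # zero for visited points
        next_point = int(rng.choice(num_points, p=probs))
        path.append(next_point)
        visited.add(next_point)
    return path

def uniform_next_probs(current, visited, num_points=5):
    # Hypothetical stand-in: equal chance for every unvisited point.
    mask = np.array([0.0 if i in visited else 1.0 for i in range(num_points)])
    return mask / mask.sum()

rng = np.random.default_rng(seed=0)
tour = build_tour(5, uniform_next_probs, rng)
print(sorted(tour) == list(range(5)))  # True: the tour is a permutation of 0..4
```

Because visited points always receive probability zero, the loop cannot revisit a point and terminates after exactly `num_points - 1` moves.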


##### Calculating Probabilities

In [16]:
def calculate_probabilities(self, current_point, visited):
    pheromone = self.pheromone_matrix[current_point]  # Pheromone levels for the current point
    dist = self.distance_matrix[current_point]  # Distances from the current point to others
    unvisited_prob = np.where(np.isin(np.arange(len(pheromone)), list(visited)), 0, 1)  # Only consider unvisited points
    row = pheromone ** self.alpha * (unvisited_prob * (1.0 / (dist + 1e-10)) ** self.beta)  # Unnormalized weights
    probabilities = row / np.sum(row)  # Normalize to sum to 1
    return probabilities

##### Distance Calculation in the Context of Ant Colony Optimization (ACO)
<p align="center">
  <img src="../image/Capture.PNG" alt="Description of the image" width="500" height="300">
</p>


In this project, `len(self.pvt_data)` gives the number of data points, or entries, in the PVT dataset. Each data point represents a specific combination of variables such as pressure, temperature, and gas solubility. In the code shown here, only the bubble point pressure is used to compute the distance between two points. The number of data points determines the number of nodes that the Ant Colony Optimization (ACO) algorithm navigates when searching for optimal paths or solutions related to GOR estimation.

##### Probability Calculation in Ant Colony Optimization (ACO)

In the Ant Colony Optimization (ACO) algorithm, an ant's probability of moving from its current point to another point primarily depends on two factors: pheromone levels and heuristic values.

**Pheromone Levels:**
- Pheromones are substances that ants deposit as they move.
- In ACO, pheromone levels on each path between points are dynamically updated.
- Ants use pheromone levels to choose paths: higher levels indicate better or more frequently used paths.
- The probability of choosing a path is influenced by pheromone levels, controlled by a parameter (alpha).

**Heuristic Information:**
- Heuristic values guide ants towards shorter paths.
- They represent additional knowledge, typically inversely related to distance in optimization problems.
- Ants prioritize paths based on heuristic values compared to pheromone levels, controlled by a parameter (beta).
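The two factors combine into the transition probability computed by `calculate_probabilities`. Written out, with notation that is our own shorthand rather than the notebook's: $\tau_{ij}$ is `pheromone_matrix[i, j]`, $d_{ij}$ is `distance_matrix[i, j]`, $V$ is the set of visited points, and $\varepsilon = 10^{-10}$ is the small constant the code adds to avoid division by zero.

```latex
p_{ij} =
\begin{cases}
\dfrac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}
      {\sum_{k \notin V} \tau_{ik}^{\alpha}\,\eta_{ik}^{\beta}}
  & \text{if } j \notin V, \\[2ex]
0 & \text{otherwise,}
\end{cases}
\qquad
\eta_{ij} = \frac{1}{d_{ij} + \varepsilon}
```

Setting $\alpha = 0$ would ignore pheromone entirely, while $\beta = 0$ would ignore distance; the defaults $\alpha = 1$, $\beta = 2$ weight distance more heavily.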


##### Choosing the Next Point

Once probabilities are calculated for each unvisited neighboring point from the current point, the next point is chosen randomly based on these probabilities. Points with higher probabilities are more likely to be selected, encouraging ants to follow paths with stronger pheromone trails and shorter distances.

In summary, the ACO algorithm balances exploiting known paths (high pheromone levels) and exploring potentially shorter paths (low distance heuristic values). This balance allows it to converge towards optimal or near-optimal solutions for complex optimization problems.
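A small worked example may help show how distance shapes the choice. The sketch below applies the same weighting as `calculate_probabilities` to three points with equal pheromone; the distances are hypothetical.

```python
import numpy as np

alpha, beta = 1, 2                      # default exponents from __init__
pheromone = np.array([0.5, 0.5, 0.5])   # equal pheromone on all edges leaving point 0
dist = np.array([0.0, 100.0, 200.0])    # hypothetical distances from point 0
unvisited = np.array([0.0, 1.0, 1.0])   # point 0 is the current point, so it is masked out

weights = pheromone ** alpha * unvisited * (1.0 / (dist + 1e-10)) ** beta
probs = weights / weights.sum()
print(probs)  # approximately [0, 0.8, 0.2]

rng = np.random.default_rng(seed=1)
next_point = int(rng.choice(len(probs), p=probs))  # sample the next point
```

With `beta = 2`, a point twice as far away is four times less likely to be chosen, so the nearer point gets 80% of the probability here.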

##### Understanding Data Points in Your Project

In this project's code, `len(self.pvt_data)` is the number of entries in the PVT dataset. Each entry holds variables such as pressure, temperature, and gas solubility, and each entry becomes one node that the Ant Colony Optimization (ACO) algorithm visits when searching for paths related to GOR estimation.

##### Updating Pheromones

In [21]:
def update_pheromone(self, ants_paths):
    self.pheromone_matrix *= self.decay  # Apply decay to all pheromone levels
    for path in ants_paths:
        for i in range(len(path) - 1):
            # Deposit pheromone inversely proportional to the edge length; the small
            # constant avoids division by zero when two points share the same pressure
            self.pheromone_matrix[path[i], path[i+1]] += 1.0 / (self.distance_matrix[path[i], path[i+1]] + 1e-10)


The `update_pheromone` function adjusts the pheromone levels on paths in response to the ants' exploration. It first simulates the gradual evaporation of pheromones by multiplying all current pheromone levels by the decay factor (`self.decay`). This keeps the algorithm from indefinitely rewarding paths that were once good but may no longer be the best.

The function then iterates through each step of every path in `ants_paths`. At each step, it raises the pheromone level in `self.pheromone_matrix` on the edge from the current point (`path[i]`) to the next point (`path[i+1]`). Shorter edges receive greater reinforcement because the amount added is inversely proportional to the distance between the two points (`self.distance_matrix[path[i], path[i+1]]`).

This procedure reinforces short, frequently used edges, guiding subsequent ants toward potentially better paths as the algorithm progresses. The function thus dynamically adjusts pheromone levels to reflect the exploration outcomes, facilitating convergence towards good solutions over successive iterations.
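The effect of one update can be traced by hand on a three-point example. The sketch below applies the same decay-then-deposit rule outside the class; the distance values are hypothetical and chosen to be nonzero.

```python
import numpy as np

decay = 0.95
# Hypothetical symmetric distances between three points (all off-diagonal values nonzero).
dist = np.array([[0.0, 100.0, 200.0],
                 [100.0, 0.0, 50.0],
                 [200.0, 50.0, 0.0]])
pheromone = np.full((3, 3), 1.0 / 3.0)  # uniform start, as in __init__

path = [0, 1, 2]  # one ant's path

pheromone *= decay  # evaporation: every edge keeps 95% of its pheromone
for a, b in zip(path[:-1], path[1:]):
    pheromone[a, b] += 1.0 / dist[a, b]  # deposit on each edge the ant used

print(pheromone[0, 1])  # 0.95 / 3 + 1 / 100
print(pheromone[1, 2])  # 0.95 / 3 + 1 / 50  (shorter edge, larger deposit)
print(pheromone[1, 0])  # 0.95 / 3           (reverse direction is not reinforced)
```

Note that the deposit is directional: only the entry for the direction actually traveled grows, so the pheromone matrix becomes asymmetric even though the distance matrix is symmetric.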

##### Getting the Shortest Path

In [24]:
def get_shortest_path(self, ants_paths):
    shortest_cost = np.inf
    shortest_path = None
    for path in ants_paths:
        path_cost = self.calculate_path_cost(path)  # Calculate the cost of the path
        if path_cost < shortest_cost:
            shortest_cost = path_cost
            shortest_path = path
    return shortest_path, shortest_cost

##### Identifying "Path" and "Cost" in Our Approach

**Path**: The path is the sequence of points an ant travels through. Each point is one entry in our PVT data, characterized by its bubble point pressure. The path is therefore the order in which the ant visits the data entries.

**Cost**: The cost measures how good or efficient a path is. It is the sum of the distances between successive points on the path, where the distance between two points is the absolute difference in their bubble point pressures. The lower the cost, the better the path.

##### Put Simply:

**Path**: The series of data points an ant visits, each identified by its bubble point pressure.

**Cost**: The total distance covered by the ant on this path, calculated from the differences in bubble point pressure between successive points. A lower-cost path is a more efficient path.

##### How the get_shortest_path Function Works in Our Case

**Initialization:** We begin with no known shortest path and set its cost to infinity.

**Analyze Every Route:** We examine every route that the ants have taken.

**Compute Cost:** For each path, we compute the total distance (cost) the ant travels, based on the differences in bubble point pressure between successive points.

**Keep the Best:** Whenever a path's cost is lower than the lowest cost seen so far, that path and its cost become the new best. After all paths are examined, the function returns the best path and its cost.
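The same selection can be expressed compactly with Python's built-in `min`. This is a sketch of an equivalent formulation, not the notebook's implementation; `path_cost` is a hypothetical helper and the pressures are made up.

```python
import numpy as np

def path_cost(path, dist):
    """Sum of distances between consecutive points on the path."""
    return sum(dist[a, b] for a, b in zip(path[:-1], path[1:]))

# Hypothetical bubble point pressures for three samples.
pressures = np.array([1500.0, 2100.0, 1800.0])
dist = np.abs(pressures[:, None] - pressures[None, :])

ants_paths = [[0, 1, 2], [0, 2, 1], [1, 0, 2]]

# Pick the path with the lowest total cost.
best_path = min(ants_paths, key=lambda p: path_cost(p, dist))
best_cost = path_cost(best_path, dist)
print(best_path, best_cost)  # [0, 2, 1] 600.0
```

When several paths tie, `min` returns the first one it encounters, which matches the strict `<` comparison in the loop-based version.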


##### Path Cost Calculation

To get the overall distance (or cost) of traveling a path, the `calculate_path_cost` function sums the distances between each pair of consecutive points in the path. The path is a sequence of data points, each with a bubble point pressure; the differences in these pressures determine the distance between points.


In [31]:
def calculate_path_cost(self, path):
    path_cost = 0
    for i in range(len(path) - 1):
        path_cost += self.distance_matrix[path[i], path[i+1]]
    return path_cost

Here, the cost is the sum of the distances between consecutive points on the path. The distance between two points is the absolute difference in their bubble point pressure (BPP) values. A path is considered more efficient when the total BPP difference along it is smaller, since this yields a lower cost.
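As an optional alternative, the summation can be done in one step with NumPy integer-array indexing. The function name `calculate_path_cost_vectorized` and the sample pressures below are assumptions for illustration.

```python
import numpy as np

def calculate_path_cost_vectorized(distance_matrix, path):
    """Total path cost via indexing: pairs (path[i], path[i+1]) select the edges."""
    path = np.asarray(path)
    return distance_matrix[path[:-1], path[1:]].sum()

# Hypothetical bubble point pressures.
pressures = np.array([1500.0, 2100.0, 1800.0])
distance_matrix = np.abs(pressures[:, None] - pressures[None, :])

# Visiting the points in order of increasing pressure: 1500 -> 1800 -> 2100.
sorted_path = np.argsort(pressures)
cost = calculate_path_cost_vectorized(distance_matrix, sorted_path)
print(cost)  # 600.0, equal to max(pressures) - min(pressures)
```

Because the distance here depends on a single variable, a path that visits the points in sorted pressure order never backtracks, so its cost equals the pressure range. That gives a handy lower bound for sanity-checking the paths the ants find.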

