# **Oilwell Placement Solutions in OilyGiant Petroleum Company**

# Project Goals

As an analyst at the OilyGiant petroleum company, I need to find a suitable location to drill a new oil well. The steps for selecting a location are:

- Collect the oil well parameters in the selected region: oil quality and volume of reserves;
- Build a model that predicts the volume of reserves in new wells;
- Choose the oil wells with the highest estimated values;
- Select the region with the highest total profit for the selected oil wells.

I have oil sample data from three regions. The parameters of each oil well in each region are already known. The goal is to build a model that helps select the region with the highest profit margin, and to analyze potential profit and risk using the bootstrapping technique.

# Project Instructions

1. Download and prepare the data. Describe the procedure that will be performed.
2. Train and test the model for each region:
    - 2.1. Separate data into training set and validation set with a ratio of 75:25.
    - 2.2. Train the model and make predictions for the validation set.
    - 2.3. Save predictions and the correct answers for validation sets.
    - 2.4. Print the average predicted volume of oil reserves and the model RMSE.
    - 2.5. Analyze the results.
3. Prepare to calculate profits:
    - 3.1. Store all key values for profit calculations in separate variables.
    - 3.2. Calculate the volume of oil reserves sufficient to develop a new well without loss. Compare this value with the average volume of oil reserves in each region.
    - 3.3. Present the findings regarding preparation for profit calculation.
4. Create a function to calculate the profit from a selected set of oil wells and create a prediction model:
    - 4.1. Choose the well with the highest predicted value.
    - 4.2. Summarize the target volume of oil reserves based on these predictions
    - 4.3. Propose an area for oil well development and provide justification or reasons for your choice. Calculate the profit for the volume of acquired oil reserves.
5. Calculate the risk and profits for each region:
    - 5.1. Use the bootstrapping technique with 1,000 samples to find the profit distribution.
    - 5.2. Find the average profit, the 95% confidence interval, and the risk of loss. A loss is a negative profit; calculate its probability and express it as a percentage.
    - 5.3. Present your findings: suggest an area for oil well development and include justification or reasons for your choice.

# Geological exploration data for the three regions is stored in three files:

- `geo_data_0.csv`
- `geo_data_1.csv`
- `geo_data_2.csv`

Each file contains the following columns:

- `id`: unique ID of the oil well
- `f0, f1, f2`: three features of the point (their specific meaning is not important, but the features themselves are significant)
- `product`: volume of oil reserves in the well (thousands of barrels)

# Conditions:

- Only linear regression is suitable for model training (the rest are not sufficient to make a prediction).
- When exploring a region, 500 points are surveyed, and the best 200 of them are selected for the profit calculation.
- The budget to develop 200 oil wells is 100 million USD.
- One barrel of raw material generates 4.5 USD of revenue. Revenue from one unit of product is 4,500 USD, because the volume of reserves is given in thousands of barrels.
- After evaluating the risks, keep only areas where the risk of loss is lower than 2.5%. Select the region with the highest average profit from the list of regions that meet the criteria.

Note: contract details and well characteristics are not shown.

In [3]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns
from tqdm import tqdm #show progress bar when processing an ML model

import warnings
warnings.filterwarnings('ignore')

#for uploading files to Google Colab
from google.colab import files

In [4]:
#upload datasets
#uploaded = files.upload()

# 1. Download and prepare the data. Describe the procedure that will be performed.

In [5]:
#load dataset
data0 = pd.read_csv(r'/content/geo_data_0.csv')
data1 = pd.read_csv(r'/content/geo_data_1.csv')
data2 = pd.read_csv(r'/content/geo_data_2.csv')

**Check data0**

In [6]:
data0.head()

Unnamed: 0,id,f0,f1,f2,product
0,txEyH,0.705745,-0.497823,1.22117,105.280062
1,2acmU,1.334711,-0.340164,4.36508,73.03775
2,409Wp,1.022732,0.15199,1.419926,85.265647
3,iJLyR,-0.032172,0.139033,2.978566,168.620776
4,Xdl7t,1.988431,0.155413,4.751769,154.036647


In [7]:
data0.shape

(100000, 5)

In [8]:
data0.isnull().sum()

id         0
f0         0
f1         0
f2         0
product    0
dtype: int64

In [9]:
data0.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100000 entries, 0 to 99999
Data columns (total 5 columns):
 #   Column   Non-Null Count   Dtype  
---  ------   --------------   -----  
 0   id       100000 non-null  object 
 1   f0       100000 non-null  float64
 2   f1       100000 non-null  float64
 3   f2       100000 non-null  float64
 4   product  100000 non-null  float64
dtypes: float64(4), object(1)
memory usage: 3.8+ MB


In [10]:
data0.describe()

Unnamed: 0,f0,f1,f2,product
count,100000.0,100000.0,100000.0,100000.0
mean,0.500419,0.250143,2.502647,92.5
std,0.871832,0.504433,3.248248,44.288691
min,-1.408605,-0.848218,-12.088328,0.0
25%,-0.07258,-0.200881,0.287748,56.497507
50%,0.50236,0.250252,2.515969,91.849972
75%,1.073581,0.700646,4.715088,128.564089
max,2.362331,1.343769,16.00379,185.364347


In [11]:
data0.drop_duplicates().shape

(100000, 5)

**Conclusion**

`data0` has no missing (null) values and no duplicate rows.

**Check data1**

In [12]:
data1.head()

Unnamed: 0,id,f0,f1,f2,product
0,kBEdx,-15.001348,-8.276,-0.005876,3.179103
1,62mP7,14.272088,-3.475083,0.999183,26.953261
2,vyE1P,6.263187,-5.948386,5.00116,134.766305
3,KcrkZ,-13.081196,-11.506057,4.999415,137.945408
4,AHL4O,12.702195,-8.147433,5.004363,134.766305


In [13]:
data1.shape

(100000, 5)

In [14]:
data1.isnull().sum()

id         0
f0         0
f1         0
f2         0
product    0
dtype: int64

In [15]:
data1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100000 entries, 0 to 99999
Data columns (total 5 columns):
 #   Column   Non-Null Count   Dtype  
---  ------   --------------   -----  
 0   id       100000 non-null  object 
 1   f0       100000 non-null  float64
 2   f1       100000 non-null  float64
 3   f2       100000 non-null  float64
 4   product  100000 non-null  float64
dtypes: float64(4), object(1)
memory usage: 3.8+ MB


In [16]:
data1.describe()

Unnamed: 0,f0,f1,f2,product
count,100000.0,100000.0,100000.0,100000.0
mean,1.141296,-4.796579,2.494541,68.825
std,8.965932,5.119872,1.703572,45.944423
min,-31.609576,-26.358598,-0.018144,0.0
25%,-6.298551,-8.267985,1.000021,26.953261
50%,1.153055,-4.813172,2.011479,57.085625
75%,8.621015,-1.332816,3.999904,107.813044
max,29.421755,18.734063,5.019721,137.945408


In [17]:
data1.drop_duplicates().shape

(100000, 5)

**Conclusion**

`data1` has no missing (null) values and no duplicate rows.

**Check data2**

In [18]:
data2.head()

Unnamed: 0,id,f0,f1,f2,product
0,fwXo0,-1.146987,0.963328,-0.828965,27.758673
1,WJtFt,0.262778,0.269839,-2.530187,56.069697
2,ovLUW,0.194587,0.289035,-5.586433,62.87191
3,q6cA6,2.23606,-0.55376,0.930038,114.572842
4,WPMUX,-0.515993,1.716266,5.899011,149.600746


In [19]:
data2.shape

(100000, 5)

In [20]:
data2.isnull().sum()

id         0
f0         0
f1         0
f2         0
product    0
dtype: int64

In [21]:
data2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100000 entries, 0 to 99999
Data columns (total 5 columns):
 #   Column   Non-Null Count   Dtype  
---  ------   --------------   -----  
 0   id       100000 non-null  object 
 1   f0       100000 non-null  float64
 2   f1       100000 non-null  float64
 3   f2       100000 non-null  float64
 4   product  100000 non-null  float64
dtypes: float64(4), object(1)
memory usage: 3.8+ MB


In [22]:
data2.describe()

Unnamed: 0,f0,f1,f2,product
count,100000.0,100000.0,100000.0,100000.0
mean,0.002023,-0.002081,2.495128,95.0
std,1.732045,1.730417,3.473445,44.749921
min,-8.760004,-7.08402,-11.970335,0.0
25%,-1.162288,-1.17482,0.130359,59.450441
50%,0.009424,-0.009482,2.484236,94.925613
75%,1.158535,1.163678,4.858794,130.595027
max,7.238262,7.844801,16.739402,190.029838


In [23]:
data2.drop_duplicates().shape

(100000, 5)

**Conclusion**

`data2` has no missing (null) values and no duplicate rows.
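The same three checks are repeated for every dataset, so they could be wrapped in one helper. The sketch below is only an illustration: `summarize_quality` and the small `demo` frame are hypothetical and not part of the original notebook. It also counts repeated `id` values, which `drop_duplicates()` on whole rows would not detect if the same well appeared twice with different measurements.

```python
import pandas as pd

def summarize_quality(df: pd.DataFrame) -> dict:
    """Return row count, missing values, and duplicate counts for one dataset."""
    return {
        "rows": len(df),
        "missing_values": int(df.isnull().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
        # The same well id may appear twice with different measurements.
        "duplicate_ids": int(df["id"].duplicated().sum()),
    }

# Tiny made-up frame standing in for one of the geo_data files.
demo = pd.DataFrame({
    "id": ["a1", "b2", "a1", "c3"],
    "f0": [0.1, 0.2, 0.3, 0.2],
    "f1": [1.0, 1.1, 1.2, 1.1],
    "f2": [2.0, 2.1, 2.2, 2.1],
    "product": [10.0, 20.0, 30.0, None],
})
report = summarize_quality(demo)
print(report)
```

With the real data, the helper could be called in a loop over `data0`, `data1`, and `data2`.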

# 2. Train and test the model for each region:
  - 2.1. Separate data into training set and validation set with a ratio of 75:25.
  - 2.2. Train the model and make predictions for the validation set.
  - 2.3. Save predictions and the correct answers for validation sets.
  - 2.4. Print the average predicted volume of oil reserves and the model RMSE.
  - 2.5. Analyze the results.
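Steps 2.1 to 2.4 can be sketched as a single function. This is a minimal illustration on synthetic data, not the notebook's own code: `train_and_validate`, the fixed integer seed, and the generated `synthetic` frame are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

def train_and_validate(data: pd.DataFrame, seed: int = 12345):
    """Split 75:25, fit a linear regression, return validation targets, predictions, RMSE."""
    features = data.drop(columns="product")
    target = data["product"]
    X_train, X_valid, y_train, y_valid = train_test_split(
        features, target, test_size=0.25, random_state=seed)
    model = LinearRegression().fit(X_train, y_train)
    # Keep the validation index so predictions and targets stay aligned.
    predictions = pd.Series(model.predict(X_valid), index=y_valid.index)
    rmse = mean_squared_error(y_valid, predictions) ** 0.5
    return y_valid, predictions, rmse

# Synthetic stand-in for one region: product depends linearly on f2 plus noise.
rng = np.random.default_rng(0)
synthetic = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["f0", "f1", "f2"])
synthetic["product"] = 50 + 10 * synthetic["f2"] + rng.normal(scale=2.0, size=1000)

y_valid, predictions, rmse = train_and_validate(synthetic)
print(len(y_valid), round(predictions.mean(), 2), round(rmse, 2))
```

Because the noise in the synthetic target has a standard deviation of 2, the RMSE should come out close to 2.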

In [24]:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

In [25]:
#Aggregate the data, so we can get the index by region
data_all = [
    data0.drop('id', axis=1),
    data1.drop('id', axis=1),
    data2.drop('id', axis=1)
]

In [26]:
#data index 0 (data0)
data_all[0]

Unnamed: 0,f0,f1,f2,product
0,0.705745,-0.497823,1.221170,105.280062
1,1.334711,-0.340164,4.365080,73.037750
2,1.022732,0.151990,1.419926,85.265647
3,-0.032172,0.139033,2.978566,168.620776
4,1.988431,0.155413,4.751769,154.036647
...,...,...,...,...
99995,0.971957,0.370953,6.075346,110.744026
99996,1.392429,-0.382606,1.273912,122.346843
99997,1.029585,0.018787,-1.348308,64.375443
99998,0.998163,-0.528582,1.583869,74.040764


In [27]:
#data index 1 (data1)
data_all[1]

Unnamed: 0,f0,f1,f2,product
0,-15.001348,-8.276000,-0.005876,3.179103
1,14.272088,-3.475083,0.999183,26.953261
2,6.263187,-5.948386,5.001160,134.766305
3,-13.081196,-11.506057,4.999415,137.945408
4,12.702195,-8.147433,5.004363,134.766305
...,...,...,...,...
99995,9.535637,-6.878139,1.998296,53.906522
99996,-10.160631,-12.558096,5.005581,137.945408
99997,-7.378891,-3.084104,4.998651,137.945408
99998,0.665714,-6.152593,1.000146,30.132364


In [28]:
#data index 2 (data2)
data_all[2]

Unnamed: 0,f0,f1,f2,product
0,-1.146987,0.963328,-0.828965,27.758673
1,0.262778,0.269839,-2.530187,56.069697
2,0.194587,0.289035,-5.586433,62.871910
3,2.236060,-0.553760,0.930038,114.572842
4,-0.515993,1.716266,5.899011,149.600746
...,...,...,...,...
99995,-1.777037,1.125220,6.263374,172.327046
99996,-1.261523,-0.894828,2.524545,138.748846
99997,-1.199934,-2.957637,5.219411,157.080080
99998,-2.419896,2.417221,-5.548444,51.795253


In [29]:
state = np.random.RandomState(12345)

samples_target = []
samples_predictions = []

for region in range(len(data_all)):
    #loop data_all by region
    data = data_all[region]

    #separate features and target
    features = data.drop('product', axis=1)
    target = data['product']

    #split into training and validation sets
    features_train, features_valid, target_train, target_valid = train_test_split(
        features, target, test_size=0.25, random_state=state)

    #train model and prediction
    model = LinearRegression()
    model.fit(features_train, target_train)
    predictions = model.predict(features_valid)

    #append the target and prediction result
    samples_target.append(target_valid.reset_index(drop=True))
    samples_predictions.append(pd.Series(predictions))

    #calculate the mean of the target, the mean of the predictions, and the RMSE
    mean_product_target = target.mean()
    mean_product_predictions = predictions.mean()
    model_rmse = mean_squared_error(target_valid, predictions)**0.5

    print("-- Region", region, "--")
    print("mean product target amount =", mean_product_target)
    print("mean product predictions amount =", mean_product_predictions)
    print("Model RMSE:", model_rmse)
    print()

-- Region 0 --
mean product target amount = 92.50000000000001
mean product predictions amount = 92.59256778438035
Model RMSE: 37.5794217150813

-- Region 1 --
mean product target amount = 68.82500000000002
mean product predictions amount = 68.76995145799754
Model RMSE: 0.889736773768065

-- Region 2 --
mean product target amount = 95.00000000000004
mean product predictions amount = 95.087528122523
Model RMSE: 39.958042459521614



**Conclusion**

Based on the `LinearRegression` models above:
- `Region 2` has the largest mean target volume (95.0);
- `Region 2` also has the largest mean predicted volume (about 95.09);
- `Region 1` has by far the lowest RMSE (about 0.89), while `Region 0` and `Region 2` have RMSE values of about 37.58 and 39.96, so predictions are much more accurate for `Region 1`.
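To put the RMSE values in context, they can be compared with the standard deviation of `product`, since a model that always predicts the mean has an RMSE close to that standard deviation. The sketch below uses only numbers copied from the `describe()` outputs and the RMSE printout above. Note that the standard deviations are for the full datasets while the RMSE values are for the validation sets, so the ratio is only approximate.

```python
# Values copied from the describe() outputs and the RMSE printout above.
product_std = {0: 44.288691, 1: 45.944423, 2: 44.749921}
model_rmse = {0: 37.5794217150813, 1: 0.889736773768065, 2: 39.958042459521614}

# Ratio near 1: barely better than predicting the mean; ratio near 0: very accurate.
ratio = {region: model_rmse[region] / product_std[region] for region in product_std}
for region, value in ratio.items():
    print(f"Region {region}: RMSE / std = {value:.2f}")
```

A ratio close to 1 for `Region 0` and `Region 2` suggests the linear model explains only a modest part of the variation there, whereas for `Region 1` it explains almost all of it.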

# 3. Prepare to calculate profits:
  - 3.1. Store all key values for profit calculations in separate variables.
  - 3.2. Calculate the volume of oil reserves sufficient to develop a new well without loss. Compare this value with the average volume of oil reserves in each region.
  - 3.3. Present the findings regarding preparation for profit calculation.

In [30]:
#conditions
SAMPLE_SIZE = 500 #sample size for bootstrapping
BOOTSTRAP_SIZE = 1000

BUDGET = 100000000
COST_PER_POINT = 500000
POINTS_PER_BUDGET = BUDGET // COST_PER_POINT

PRODUCT_PRICE = 4500
POINTS_PER_BUDGET

200
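Step 3.2 asks for the volume of reserves at which a new well breaks even. A minimal sketch of that arithmetic follows, using the constants defined above and the regional means of `product` from the `describe()` outputs. The names `cost_per_well`, `break_even_volume`, and `region_mean_product` are introduced here only for illustration.

```python
BUDGET = 100_000_000       # USD for 200 wells
POINTS_PER_BUDGET = 200    # wells covered by the budget
PRODUCT_PRICE = 4500       # USD per unit of product (1,000 barrels)

cost_per_well = BUDGET / POINTS_PER_BUDGET
# Volume (in thousands of barrels) a well must hold to cover its own cost.
break_even_volume = cost_per_well / PRODUCT_PRICE

# Mean product per region, copied from the describe() outputs above.
region_mean_product = {0: 92.5, 1: 68.825, 2: 95.0}

print(f"Break-even volume per well: {break_even_volume:.2f} thousand barrels")
for region, mean_volume in region_mean_product.items():
    gap = mean_volume - break_even_volume
    print(f"Region {region}: mean = {mean_volume:.2f}, gap to break-even = {gap:.2f}")
```

The average well in every region falls short of the break-even volume, which is why the wells have to be selected rather than drilled at random.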

In [31]:
#income predictions for region 0
samples_predictions_0 = samples_predictions[0]

In [32]:
#profit calculation for region 0
income = 4500 #USD per unit of product (1,000 barrels)
top_200_product = samples_predictions_0.sort_values(ascending=False)[:200]
total_product = top_200_product.sum()
total_income = income * total_product
total_cost = BUDGET
profit = total_income - total_cost
print("Profit:", round(profit), "USD")
print()

Profit: 39960489 USD



In [33]:
#income predictions for region 1
samples_predictions_1 = samples_predictions[1]

In [34]:
#profit calculation for region 1
income = 4500 #USD per unit of product (1,000 barrels)
top_200_product = samples_predictions_1.sort_values(ascending=False)[:200]
total_product = top_200_product.sum()
total_income = income * total_product
total_cost = BUDGET
profit = total_income - total_cost
print("Profit:", round(profit), "USD")
print()

Profit: 24873891 USD



In [35]:
#income predictions for region 2
samples_predictions_2 = samples_predictions[2]

In [36]:
#profit calculation for region 2
income = 4500 #USD per unit of product (1,000 barrels)
top_200_product = samples_predictions_2.sort_values(ascending=False)[:200]
total_product = top_200_product.sum()
total_income = income * total_product
total_cost = BUDGET
profit = total_income - total_cost
print("Profit:", round(profit), "USD")
print()

Profit: 34224063 USD



**Conclusion**

Under the stated conditions, the budget covers 200 wells (`POINTS_PER_BUDGET`). Based on the 200 highest predicted volumes in each validation set, the predicted profit is **39960489 USD** for `Region 0`, 24873891 USD for `Region 1`, and 34224063 USD for `Region 2`, so `Region 0` has the largest predicted profit.

# 4. Create a function to calculate the profit from a selected set of oil wells and create a prediction model:
  - 4.1. Choose the well with the highest predicted value.
  - 4.2. Summarize the target volume of oil reserves based on these predictions
  - 4.3. Propose an area for oil well development and provide justification or reasons for your choice. Calculate the profit for the volume of acquired oil reserves.

In [37]:
def predictions_profit(prediction, name, income=4500, total_cost=100000000, points=200):
    prediction = prediction[name]
    predict_top200 = prediction.sort_values(ascending=False)[:points]
    product = predict_top200.sum()
    total_cost = round(total_cost / 1000000)
    total_income = round(income * product / 1000000)
    profit = round(total_income - total_cost)
    print('-------------------------------')
    print(f'Profitability Geo Data {name}')
    print(f'Total Income: {total_income}')
    print(f'Total Cost  : {total_cost}')
    print(f'Profit      : {profit}', 'M USD')

In [38]:
def target_profit(target, predictions):
    predictions_sorted = predictions.sort_values(ascending=False)
    selected_points = target[predictions_sorted.index][:POINTS_PER_BUDGET]
    product = selected_points.sum()
    revenue = product * PRODUCT_PRICE
    cost = BUDGET
    profit = revenue - cost
    return profit
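The difference between valuing wells by prediction and by actual volume is easiest to see on a toy example. The sketch below restates the logic of `target_profit` with explicit parameters. The function name `profit_from_selection`, the four-well data, and the small budget are made up for illustration.

```python
import pandas as pd

def profit_from_selection(target, predictions, points, price, budget):
    """Pick the `points` wells with the highest predictions, value them at actual volume."""
    ranked = predictions.sort_values(ascending=False)
    selected = target.loc[ranked.index[:points]]
    return selected.sum() * price - budget

# Four hypothetical wells: actual volumes and model predictions (thousand barrels).
target = pd.Series([100.0, 150.0, 80.0, 120.0])
predictions = pd.Series([110.0, 90.0, 130.0, 125.0])

profit = profit_from_selection(target, predictions, points=2, price=4500, budget=500_000)
predicted_profit = predictions.nlargest(2).sum() * 4500 - 500_000
print(profit, predicted_profit)
```

The model picks wells 2 and 3, whose actual volumes sum to 200, so the realized profit (400,000) is lower than the profit implied by the predictions (647,500). The best actual well (index 1) is missed because its prediction is low.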

In [39]:
POINTS_PER_BUDGET

200

In [40]:
# Select oilwell with highest predictions value
top_200_predictions = samples_predictions[0].sort_values(ascending=False)[:POINTS_PER_BUDGET] #predict 200 highest oilwell

In [41]:
top_200_predictions

9317     180.180713
219      176.252213
10015    175.850623
11584    175.658429
23388    173.299686
            ...    
7888     148.507064
7890     148.481767
24051    148.476498
24160    148.436761
20340    148.365941
Length: 200, dtype: float64

In [42]:
# Summarize the target volume of oil reserves based on these predictions
top_200_target = samples_target[0][top_200_predictions.index] #target based on predictions of the highest 200 wells

In [43]:
#the total product
top_200_target.sum()

29601.83565142189

In [44]:
#product price
PRODUCT_PRICE

4500

In [45]:
#revenue = total product * product price
top_200_target.sum() * PRODUCT_PRICE

133208260.43139851

In [46]:
# profit = revenue- budget
top_200_target.sum() * PRODUCT_PRICE - BUDGET

33208260.43139851

In [47]:
#calculate profit based on prediction
predictions_profit(prediction=samples_predictions, name=0)
predictions_profit(prediction=samples_predictions, name=1)
predictions_profit(prediction=samples_predictions, name=2)

-------------------------------
Profitability Geo Data 0
Total Income: 140
Total Cost  : 100
Profit      : 40 M USD
-------------------------------
Profitability Geo Data 1
Total Income: 125
Total Cost  : 100
Profit      : 25 M USD
-------------------------------
Profitability Geo Data 2
Total Income: 134
Total Cost  : 100
Profit      : 34 M USD


In [48]:
#calculate profit based on target
target_profit(samples_target[0], samples_predictions[0])

33208260.43139851

In [49]:
target_profit(samples_target[1], samples_predictions[1])

24150866.966815114

In [50]:
target_profit(samples_target[2], samples_predictions[2])

25399159.45842947

**Conclusion**

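Based on the calculations above, `Region 0` has the highest profit when the top 200 wells are valued at their actual volumes: **33208260 USD**, compared with 24150867 USD for `Region 1` and 25399159 USD for `Region 2`.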

# 5. Calculate the risk and profits for each region:
  - 5.1. Use the bootstrapping technique with 1,000 samples to find the profit distribution.
  - 5.2. Find the average profit, the 95% confidence interval, and the risk of loss. A loss is a negative profit; calculate its probability and express it as a percentage.
  - 5.3. Present your findings: suggest an area for oil well development and include justification or reasons for your choice.

In [51]:
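def calculate_profit_bootstrap(prediction, name, income=4500, total_cost=100000000, points=200):
    #name is accepted so that existing calls keep working, but it is not used
    #take the `points` highest predicted volumes
    predict_top200 = prediction.sort_values(ascending=False)[:points]
    product = predict_top200.sum()
    #profit = revenue from the predicted volume minus the development budget
    total_income = income * product
    profit = total_income - total_cost

    return profit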

In [53]:
for region in range(3):

    target = samples_target[region]
    predictions = samples_predictions[region]

    profit_values = []

    for i in tqdm(range(BOOTSTRAP_SIZE)):
        target_sample = target.sample(SAMPLE_SIZE, replace=True, random_state=state)
        predictions_sample = predictions[target_sample.index]

        #calculate profit value using predictions
        profit_values.append(calculate_profit_bootstrap(prediction=predictions_sample, name=region))

    profit_values = pd.Series(profit_values)

    #calculate mean, confidence interval, and risk losses
    mean_profit = profit_values.mean()
    confidence_interval = (profit_values.quantile(0.025), profit_values.quantile(0.975))
    negative_profit_chance = (profit_values < 0).mean()

    print("--Region", region, "--")
    print("Mean profit =", mean_profit, 'USD')
    print("95% confidence interval:", confidence_interval)
    print("Risk of losses =", negative_profit_chance * 100, "%")
    print()

--Region 0 --
Mean profit = 3547643.004619073 USD
95% confidence interval: (1291748.6006637588, 5899625.922454955)
Risk of losses = 0.1 %



--Region 1 --
Mean profit = 4527355.340501758 USD
95% confidence interval: (592566.4060110353, 8572618.296502348)
Risk of losses = 0.7000000000000001 %



--Region 2 --
Mean profit = 2852822.3719312106 USD
95% confidence interval: (925963.4941324021, 4786743.6800649315)
Risk of losses = 0.2 %






**Conclusion**

When profit is calculated from predicted volumes with the bootstrap technique (1000 samples), the results are as follows:
- The highest average profit is in `Region 1`, at **4527355 USD**, with a risk of loss of **0.7%**. This is the highest risk of the three regions, although it is still below the 2.5% threshold.
- The lowest average profit is in `Region 2`, at **2852822 USD**, with a risk of loss of **0.2%**.

These figures are based on predicted volumes rather than actual volumes, so they do not reflect the model's prediction error. The region with the highest mean profit also has the highest risk.

In [54]:
for region in range(3):

    target = samples_target[region]
    predictions = samples_predictions[region]

    profit_values = []

    for i in tqdm(range(BOOTSTRAP_SIZE)):
        target_sample = target.sample(SAMPLE_SIZE, replace=True, random_state=state)
        predictions_sample = predictions[target_sample.index]

        #calculate profit value using target
        profit_values.append(target_profit(target_sample, predictions_sample))

    profit_values = pd.Series(profit_values)

    #calculate mean, confidence interval, and risk losses
    mean_profit = profit_values.mean()
    confidence_interval = (profit_values.quantile(0.025), profit_values.quantile(0.975))
    negative_profit_chance = (profit_values < 0).mean()

    print("--Region", region, "--")
    print("Mean profit =", mean_profit, 'USD')
    print("95% confidence interval:", confidence_interval)
    print("Risk of losses =", negative_profit_chance * 100, "%")
    print()

--Region 0 --
Mean profit = 4250638.562352957 USD
95% confidence interval: (-1202243.162177002, 9825548.633446721)
Risk of losses = 6.2 %



--Region 1 --
Mean profit = 5163314.232693503 USD
95% confidence interval: (1171216.3594473044, 9600024.599272117)
Risk of losses = 0.5 %



--Region 2 --
Mean profit = 3646019.9108492387 USD
95% confidence interval: (-1619638.6069313644, 9295097.313283822)
Risk of losses = 9.700000000000001 %






**Conclusion**

When profit is calculated from actual (target) volumes with the bootstrap technique (1000 samples), the results are as follows:
- The highest average profit is in `Region 1`, at **5163314 USD**, with a risk of loss of **0.5%**.
- The lowest average profit is in `Region 2`, at **3646020 USD**, with a risk of loss of **9.7%**.
- `Region 0` has an average profit of 4250639 USD, with a risk of loss of 6.2%.

Only `Region 1` keeps the risk of loss below the 2.5% threshold.
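One detail worth checking in a bootstrap like this is how pandas handles repeated index labels. Sampling with replacement keeps the original labels, and looking up a repeated label in a series that itself contains repeats returns every match, so the selection can hold more rows than intended. The sketch below is one possible way to avoid that, by relabelling each draw. The function `bootstrap_profit`, its parameters, and the synthetic data are assumptions for illustration and do not reproduce the notebook's results.

```python
import numpy as np
import pandas as pd

def bootstrap_profit(target, predictions, n_boot=1000, sample_size=500,
                     points=200, price=4500, budget=100_000_000, seed=12345):
    """Bootstrap the profit of the top `points` wells chosen by prediction."""
    state = np.random.RandomState(seed)
    profits = []
    for _ in range(n_boot):
        # Draw positions with replacement, then give every draw a fresh label
        # so repeated wells do not expand into extra rows in label lookups.
        positions = state.randint(0, len(target), size=sample_size)
        target_sample = target.iloc[positions].reset_index(drop=True)
        pred_sample = predictions.iloc[positions].reset_index(drop=True)
        chosen = pred_sample.sort_values(ascending=False).index[:points]
        profits.append(target_sample.loc[chosen].sum() * price - budget)
    profits = pd.Series(profits)
    return {
        "mean": profits.mean(),
        "ci_95": (profits.quantile(0.025), profits.quantile(0.975)),
        "risk_of_loss_pct": (profits < 0).mean() * 100,
    }

# Synthetic stand-in for one region's validation targets and predictions.
rng = np.random.default_rng(0)
target = pd.Series(rng.uniform(0, 190, size=5000))
predictions = target + pd.Series(rng.normal(0, 40, size=5000))
result = bootstrap_profit(target, predictions, n_boot=200)
print(result)

# Why relabelling matters: a lookup with repeated labels returns every match.
dup = pd.Series([10.0, 20.0], index=[0, 1]).loc[[0, 0, 1]]
print(len(dup), len(dup.loc[dup.index]))
```

In the last two lines, a three-row sample with one repeated label grows to five rows when indexed by its own labels, which is the effect the `reset_index(drop=True)` calls are meant to prevent.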

# 6. Final Conclusion

The analysis uses three datasets:
- `geo_data_0.csv`
- `geo_data_1.csv`
- `geo_data_2.csv`

Each dataset has 5 columns and 100000 rows, with no missing values or duplicate rows. We used a machine learning model to help OilyGiant choose a region for development. After checking the data, we trained a separate model for each region. As required by the conditions, the model is **Linear Regression**.

After training the models, we prepared the profit calculation. Under the stated conditions, the budget covers 200 wells. Based on the 200 highest predictions, the largest predicted profit is in `Region 0`, at 39960489 USD.

Next, we created functions that calculate the profit from a selected set of oil wells. We tried two approaches: valuing the selected wells by their predicted volumes and by their actual (target) volumes. On the full validation sets, both approaches agree that the region with the greatest profit is `Region 0`.

In the final step, we calculated the risk and profit for each region. The two approaches stay the same, but bootstrapping with 1000 samples is added to find the profit distribution. With target volumes, the highest average profit is in `Region 1`, at **5163314 USD**, with a risk of loss of **0.5%**. The lowest average profit is in `Region 2`, at **3646020 USD**, with a risk of loss of **9.7%**.

`Region 1` is the only region where the target-based risk of loss is below the 2.5% threshold, and it also has the highest average profit, so it is the region to propose for development.
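The selection rule from the Conditions section (keep regions with a risk of loss below 2.5%, then take the highest average profit) can be applied mechanically to the target-based bootstrap output. The sketch below uses values copied from that output. The names `results`, `MAX_RISK_PCT`, `eligible`, and `best_region` are introduced here for illustration.

```python
# Values copied from the target-based bootstrap output above.
results = {
    0: {"mean_profit": 4250638.562352957, "risk_pct": 6.2},
    1: {"mean_profit": 5163314.232693503, "risk_pct": 0.5},
    2: {"mean_profit": 3646019.9108492387, "risk_pct": 9.7},
}
MAX_RISK_PCT = 2.5  # threshold from the Conditions section

# Keep only regions under the risk threshold, then pick the highest mean profit.
eligible = {r: v for r, v in results.items() if v["risk_pct"] < MAX_RISK_PCT}
best_region = (
    max(eligible, key=lambda r: eligible[r]["mean_profit"]) if eligible else None
)

print("Eligible regions:", sorted(eligible))
print("Proposed region:", best_region)
```

With these numbers, only `Region 1` passes the risk filter, so it is also the proposed region.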
