<h1 align="center"> Project Setup </h1>

#### Importing Dependencies

In [1]:
import os
import sys

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.cluster import KMeans

#### Importing Datasets

In [2]:
# Display all columns in the DataFrame using pandas settings
pd.set_option('display.max_columns', None)

In [3]:
df_imports = pd.read_excel('data/WtoData_worldwide_import_from_2010_to_2022_all_countries.xlsx')

<br><br>

<h1 align="center"> Summary</h1>

### Objective

This project aims to help a country, here Haiti, identify the right markets for its exports. It offers guidance on foreign trade by showing which countries have high demand in different sectors, especially agricultural products.

The specific objectives of this project are:
1. Analyze the dataset that we are going to collect to better understand international trade and give some insight into it.
2. Choose the best place with high demand for agricultural products so that Haiti can promote and sell its products.

<hr>

### Hypothesis: Research Question?

What is the question that you would like to answer in order to make a decision?

<hr>

### Data Source

For this project we use the World Trade Organization (WTO) Data portal. The portal contains statistical indicators. Its time series cover merchandise trade and trade in services statistics, market access indicators (bound, applied and preferential tariffs), non-tariff information, and other indicators.

<br><br>

<h1 align="center"> Data Cleaning </h1>

#### Data Overview

In [4]:
# Display the first rows of the imports-by-product DataFrame
print('------ Imports Dataset ------')
display(df_imports.head(4))

------ Imports Dataset ------


Unnamed: 0,Indicator,Merchandise imports by product group – annual (Million US dollar),Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10,Unnamed: 11,Unnamed: 12,Unnamed: 13
0,,,,,,,,,,,,,,
1,Reporting Economy,Product/Sector,Partner Economy,2010.0,2011.0,2012.0,2013.0,2014.0,2015.0,2016.0,2017.0,2018.0,2019.0,2020.0
2,World,SI3_AGG - TO - Total merchandise,World,15438092.0,18438364.0,18657296.0,18966119.0,19060809.0,16733507.0,16211194.0,17985896.0,19836342.0,19284167.0,17812107.0
3,World,SI3_AGG - AG - Agricultural products,World,1391529.0,1701893.0,1681611.0,1756648.0,1798003.0,1594510.0,1599989.0,1759978.0,1851346.0,1823105.0,


In [5]:
print('------------------------------ Dataset Shape ------------------------------')
print('The Imports dataset has',df_imports.shape[0], 'Rows and', df_imports.shape[1],'columns')

print('------------------------------ Dataframe Columns ------------------------------')
display(df_imports.columns)

print('------------------------------ Data types ------------------------------')
display(df_imports.dtypes)

------------------------------ Dataset Shape ------------------------------
The Imports dataset has 3287 Rows and 14 columns
------------------------------ Dataframe Columns ------------------------------


Index(['Indicator',
       '  Merchandise imports by product group – annual (Million US dollar)',
       'Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4', 'Unnamed: 5', 'Unnamed: 6',
       'Unnamed: 7', 'Unnamed: 8', 'Unnamed: 9', 'Unnamed: 10', 'Unnamed: 11',
       'Unnamed: 12', 'Unnamed: 13'],
      dtype='object')

------------------------------ Data types ------------------------------


Indicator                                                               object
  Merchandise imports by product group – annual (Million US dollar)     object
Unnamed: 2                                                              object
Unnamed: 3                                                             float64
Unnamed: 4                                                             float64
Unnamed: 5                                                             float64
Unnamed: 6                                                             float64
Unnamed: 7                                                             float64
Unnamed: 8                                                             float64
Unnamed: 9                                                             float64
Unnamed: 10                                                            float64
Unnamed: 11                                                            float64
Unnamed: 12                                         

In [6]:
# Display all columns
print('-------- Imports column names --------')
display(df_imports.columns)

# Mapping from the default column names to descriptive ones
map_cols_name = {
    'Indicator':'Reporting Economy',
    '  Merchandise imports by product group – annual (Million US dollar)': 'Product/Sector',
    'Unnamed: 2': 'Partner Economy',
    'Unnamed: 3': '2010',
    'Unnamed: 4': '2011',
    'Unnamed: 5': '2012',
    'Unnamed: 6': '2013',
    'Unnamed: 7': '2014',
    'Unnamed: 8': '2015',
    'Unnamed: 9': '2016',
    'Unnamed: 10': '2017',
    'Unnamed: 11': '2018',
    'Unnamed: 12': '2019',
    'Unnamed: 13': '2020',
}

# Change all default column names
renamed_cols_df = df_imports.rename(columns=map_cols_name)

print('-------- Imports Dataframe with new cols names --------')
display(renamed_cols_df.head(5))

-------- Imports column names --------


Index(['Indicator',
       '  Merchandise imports by product group – annual (Million US dollar)',
       'Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4', 'Unnamed: 5', 'Unnamed: 6',
       'Unnamed: 7', 'Unnamed: 8', 'Unnamed: 9', 'Unnamed: 10', 'Unnamed: 11',
       'Unnamed: 12', 'Unnamed: 13'],
      dtype='object')

-------- Imports Dataframe with new cols names --------


Unnamed: 0,Reporting Economy,Product/Sector,Partner Economy,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020
0,,,,,,,,,,,,,,
1,Reporting Economy,Product/Sector,Partner Economy,2010.0,2011.0,2012.0,2013.0,2014.0,2015.0,2016.0,2017.0,2018.0,2019.0,2020.0
2,World,SI3_AGG - TO - Total merchandise,World,15438092.0,18438364.0,18657296.0,18966119.0,19060809.0,16733507.0,16211194.0,17985896.0,19836342.0,19284167.0,17812107.0
3,World,SI3_AGG - AG - Agricultural products,World,1391529.0,1701893.0,1681611.0,1756648.0,1798003.0,1594510.0,1599989.0,1759978.0,1851346.0,1823105.0,
4,World,SI3_AGG - AGFO - Food,World,1144082.0,1387912.0,1391766.0,1465158.0,1510854.0,1340065.0,1354624.0,1482990.0,1551785.0,1547817.0,


In [7]:
# Remove the first 2 rows of the imports DataFrame (an empty row and the row that repeats the column labels)
to_drop_rows = renamed_cols_df.index[:2]

dropped_rows_df = renamed_cols_df.drop(to_drop_rows).reset_index(drop=True)

display(dropped_rows_df.head())

Unnamed: 0,Reporting Economy,Product/Sector,Partner Economy,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020
0,World,SI3_AGG - TO - Total merchandise,World,15438092.0,18438364.0,18657296.0,18966119.0,19060809.0,16733507.0,16211194.0,17985896.0,19836342.0,19284167.0,17812107.0
1,World,SI3_AGG - AG - Agricultural products,World,1391529.0,1701893.0,1681611.0,1756648.0,1798003.0,1594510.0,1599989.0,1759978.0,1851346.0,1823105.0,
2,World,SI3_AGG - AGFO - Food,World,1144082.0,1387912.0,1391766.0,1465158.0,1510854.0,1340065.0,1354624.0,1482990.0,1551785.0,1547817.0,
3,World,SI3_AGG - MI - Fuels and mining products,World,3183852.0,4278365.0,4227403.0,4208130.0,3874299.0,2489603.0,2124334.0,2755203.0,3408176.0,3168025.0,
4,World,SI3_AGG - MIFU - Fuels,World,2462969.0,3361172.0,3401742.0,3403308.0,3091267.0,1859835.0,1528358.0,2006039.0,2570707.0,2350244.0,
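
The rename and drop steps above can also be expressed by promoting the row that holds the real labels to the header. The following is a minimal sketch on a toy frame that mimics the layout shown above; the toy values and the names `raw`, `labels` and `clean` are assumptions for illustration, not the WTO file. `pd.read_excel` also accepts `header` and `skiprows` arguments that might achieve the same thing at load time, depending on how the sheet is laid out.

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the raw layout: a blank row, a label row, then data.
raw = pd.DataFrame(
    [
        [np.nan, np.nan, np.nan, np.nan, np.nan],
        ["Reporting Economy", "Product/Sector", "Partner Economy", 2010.0, 2011.0],
        ["World", "SI3_AGG - TO - Total merchandise", "World", 100.0, 120.0],
        ["World", "SI3_AGG - AG - Agricultural products", "World", 10.0, 12.0],
    ],
    columns=["Indicator", "Title", "Unnamed: 2", "Unnamed: 3", "Unnamed: 4"],
)

# Row 1 holds the real labels; year labels arrive as floats (2010.0),
# so convert them to strings such as '2010'.
labels = [str(int(v)) if isinstance(v, float) else v for v in raw.iloc[1]]

# Keep only the data rows and attach the recovered labels.
clean = raw.iloc[2:].reset_index(drop=True)
clean.columns = labels
print(clean)
```

Deriving the labels from the data avoids hard-coding one dictionary entry per year, which helps if the export later covers a different range of years.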


In [8]:
print('------------ Display values in the Product/Sectors on the Import dataset columns ------------')
display(dropped_rows_df['Product/Sector'].value_counts())

# Strip the classification codes: keep only the text after the last hyphen in each "Product/Sector" value
dropped_rows_df['Product/Sector'] = dropped_rows_df['Product/Sector'].apply(lambda x: x.split('- ')[-1])

# Also drop the 'Partner Economy' column
df_final = dropped_rows_df.drop(['Partner Economy'],axis=1)

print('---- Final import dataframe ----')
display(df_final.head())

------------ Display values in the Product/Sectors on the Import dataset columns ------------


SI3_AGG - TO - Total merchandise                                        207
SI3_AGG - MI - Fuels and mining products                                182
SI3_AGG - AGFO - Food                                                   182
SI3_AGG - MACH - Chemicals                                              182
SI3_AGG - MIFU - Fuels                                                  182
SI3_AGG - MA - Manufactures                                             182
SI3_AGG - MAMT - Machinery and transport equipment                      182
SI3_AGG - AG - Agricultural products                                    182
SI3_AGG - MAMTAU - Automotive products                                  181
SI3_AGG - MAIS - Iron and steel                                         181
SI3_AGG - MAMTOTTL - Telecommunications equipment                       181
SI3_AGG - MACL - Clothing                                               181
SI3_AGG - MAMTTE - Transport equipment                                  181
SI3_AGG - MA

---- Final import dataframe ----


Unnamed: 0,Reporting Economy,Product/Sector,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020
0,World,Total merchandise,15438092.0,18438364.0,18657296.0,18966119.0,19060809.0,16733507.0,16211194.0,17985896.0,19836342.0,19284167.0,17812107.0
1,World,Agricultural products,1391529.0,1701893.0,1681611.0,1756648.0,1798003.0,1594510.0,1599989.0,1759978.0,1851346.0,1823105.0,
2,World,Food,1144082.0,1387912.0,1391766.0,1465158.0,1510854.0,1340065.0,1354624.0,1482990.0,1551785.0,1547817.0,
3,World,Fuels and mining products,3183852.0,4278365.0,4227403.0,4208130.0,3874299.0,2489603.0,2124334.0,2755203.0,3408176.0,3168025.0,
4,World,Fuels,2462969.0,3361172.0,3401742.0,3403308.0,3091267.0,1859835.0,1528358.0,2006039.0,2570707.0,2350244.0,
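
The prefix stripping can also be written with the vectorized `.str` accessor, which lets missing values pass through where a plain `lambda` calling `x.split` would raise on `NaN`. A small sketch with made-up labels (the series `sectors` is illustrative only):

```python
import numpy as np
import pandas as pd

sectors = pd.Series(
    [
        "SI3_AGG - TO - Total merchandise",
        "SI3_AGG - AGFO - Food",
        np.nan,  # a missing label, included to show that NaN passes through
    ]
)

# Split once from the right on ' - ' and keep the last piece (the readable name).
names = sectors.str.rsplit(" - ", n=1).str[-1]
print(names.tolist())
```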


#### Analytical Transformations

In [9]:
# Perform any transformation on the columns in the dataset to enable further analysis.
df_melt = df_final.copy()

In [10]:
# Melt the Dataframe
df_melt = pd.melt(frame=df_final, 
                  id_vars=['Product/Sector','Reporting Economy'], 
                  var_name='Year', 
                  value_name="Million US dollar")

# Reshape the dataframe using pivot_table
reshape = df_melt.pivot_table(columns='Product/Sector',
                             index=['Year','Reporting Economy'],
                             values="Million US dollar")

# Reset the index so that 'Year' and 'Reporting Economy' become regular columns
reshape = reshape.reset_index()

# Remove Index name
reshape = reshape.rename_axis(None, axis=1)

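# Reorder the columns so that 'Total merchandise' comes last
# (note: 'Iron and steel' is not in the selection below, so it is dropped here)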
print('------------------ Display all columns ------------------')
display(reshape.columns)
print('------------------ Swap columns ------------------')
df_reshape = reshape[['Year','Reporting Economy','Agricultural products',
                   'Automotive products','Chemicals','Clothing',
                   'Electronic data processing and office equipment','Food','Fuels',
                   'Fuels and mining products','Integrated circuits and electronic components',
                  'Machinery and transport equipment','Manufactures','Office and telecom equipment',
                   'Pharmaceuticals','Telecommunications equipment','Textiles',
                   'Transport equipment','Total merchandise']]

# Preview the reshaped DataFrame
df_reshape.head()

------------------ Display all columns ------------------


Index(['Year', 'Reporting Economy', 'Agricultural products',
       'Automotive products', 'Chemicals', 'Clothing',
       'Electronic data processing and office equipment', 'Food', 'Fuels',
       'Fuels and mining products',
       'Integrated circuits and electronic components', 'Iron and steel',
       'Machinery and transport equipment', 'Manufactures',
       'Office and telecom equipment', 'Pharmaceuticals',
       'Telecommunications equipment', 'Textiles', 'Total merchandise',
       'Transport equipment'],
      dtype='object')

------------------ Swap columns ------------------


Unnamed: 0,Year,Reporting Economy,Agricultural products,Automotive products,Chemicals,Clothing,Electronic data processing and office equipment,Food,Fuels,Fuels and mining products,Integrated circuits and electronic components,Machinery and transport equipment,Manufactures,Office and telecom equipment,Pharmaceuticals,Telecommunications equipment,Textiles,Transport equipment,Total merchandise
0,2010,Afghanistan,706.0,193.0,82.0,12.0,,706.0,1075.0,1090.0,,339.0,984.0,19.0,,19.0,118.0,197.0,5154.0
1,2010,Albania,872.0,217.0,464.0,173.0,63.0,826.0,635.0,802.0,14.0,875.0,2731.0,176.0,155.0,99.0,168.0,241.0,4406.0
2,2010,Algeria,7350.0,3981.0,4452.0,183.0,520.0,6683.0,867.0,1493.0,96.0,16716.0,31367.0,1215.0,1719.0,599.0,351.0,5108.0,40473.0
3,2010,Angola,2882.0,1204.0,963.0,127.0,174.0,2764.0,3105.0,3233.0,19.0,6475.0,10521.0,561.0,148.0,369.0,108.0,2090.0,16667.0
4,2010,Antigua and Barbuda,113.0,19.0,33.0,8.0,7.0,107.0,3.0,6.0,1.0,99.0,242.0,15.0,9.0,8.0,12.0,51.0,501.0
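
To make the reshape easier to follow, here is the same melt and pivot sequence on a tiny made-up table. The economies `A` and `B` and all values are assumptions chosen for readability.

```python
import pandas as pd

# Toy wide table: one row per (economy, sector), one column per year.
wide = pd.DataFrame(
    {
        "Reporting Economy": ["A", "A", "B", "B"],
        "Product/Sector": ["Food", "Fuels", "Food", "Fuels"],
        "2010": [1.0, 2.0, 3.0, 4.0],
        "2011": [5.0, 6.0, 7.0, 8.0],
    }
)

# Wide -> long: one row per (economy, sector, year).
long = pd.melt(
    wide,
    id_vars=["Product/Sector", "Reporting Economy"],
    var_name="Year",
    value_name="Million US dollar",
)

# Long -> one row per (year, economy), one column per sector.
# pivot_table aggregates duplicate entries with the mean by default.
tidy = (
    long.pivot_table(
        index=["Year", "Reporting Economy"],
        columns="Product/Sector",
        values="Million US dollar",
    )
    .reset_index()
    .rename_axis(None, axis=1)
)
print(tidy)
```

Because `pivot_table` averages duplicates silently, it can be worth checking beforehand that each (year, economy, sector) combination appears only once in the long table.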


#### Treatment of Missing Values

In [11]:
# Checking for missing values in the import dataset
df_reshape.isnull().sum()

Year                                                 0
Reporting Economy                                    0
Agricultural products                              508
Automotive products                                518
Chemicals                                          506
Clothing                                           518
Electronic data processing and office equipment    530
Food                                               506
Fuels                                              506
Fuels and mining products                          506
Integrated circuits and electronic components      532
Machinery and transport equipment                  506
Manufactures                                       508
Office and telecom equipment                       518
Pharmaceuticals                                    532
Telecommunications equipment                       518
Textiles                                           520
Transport equipment                                518
Total merc

In [12]:
# Check that the DataFrame contains no duplicated rows
print(df_reshape.duplicated().sum(), 'value')

0 value


In [13]:
# Keep only rows with at least 4 non-NaN values (the 2 identifier columns plus at least 2 sector values)
df_reshape_drop_all = df_reshape.dropna(thresh=4)
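
The `thresh` argument counts non-missing values per row, identifier columns included, so it is not the same as dropping rows whose values are all missing. A minimal sketch of the difference on made-up data (the frame `toy` and its values are illustrative):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame(
    {
        "Year": ["2010", "2010", "2010"],
        "Reporting Economy": ["A", "B", "C"],
        "Food": [1.0, np.nan, np.nan],
        "Fuels": [2.0, 3.0, np.nan],
        "Textiles": [np.nan, np.nan, np.nan],
    }
)

# thresh=4 keeps rows with at least 4 non-missing values (identifiers included).
kept_thresh = toy.dropna(thresh=4)

# Dropping only rows whose value columns are ALL missing needs how='all',
# restricted to those columns via subset.
value_cols = list(toy.columns[2:])
kept_all = toy.dropna(subset=value_cols, how="all")

print(kept_thresh["Reporting Economy"].tolist())
print(kept_all["Reporting Economy"].tolist())
```

In this toy case economy `B` reports one sector and survives `how="all"` but not `thresh=4`, so the choice between the two changes which economies remain in the analysis.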

In [14]:
# Fill the remaining missing values with 0 using fillna
df_reshape_fill_na = df_reshape_drop_all.fillna(0)

In [15]:
print('---- World import data ----')
display(df_reshape_fill_na.isna().sum())


print('---------------')
display('Now all of our empty values have been successfully filled with 0')

---- World import data ----


Year                                               0
Reporting Economy                                  0
Agricultural products                              0
Automotive products                                0
Chemicals                                          0
Clothing                                           0
Electronic data processing and office equipment    0
Food                                               0
Fuels                                              0
Fuels and mining products                          0
Integrated circuits and electronic components      0
Machinery and transport equipment                  0
Manufactures                                       0
Office and telecom equipment                       0
Pharmaceuticals                                    0
Telecommunications equipment                       0
Textiles                                           0
Transport equipment                                0
Total merchandise                             

---------------


'Now all of our empty values have been successfully filled with 0'
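
Filling with 0 treats "not reported" as "no imports", which pulls means and medians down. The sketch below shows the size of that effect on made-up numbers; whether 0 is the right fill value for the WTO data is a judgment call that this example does not settle.

```python
import numpy as np
import pandas as pd

imports = pd.Series([100.0, 200.0, np.nan, np.nan])

# NaN values are skipped by default: (100 + 200) / 2
mean_skipping_nan = imports.mean()

# After filling, the zeros are counted: (100 + 200 + 0 + 0) / 4
mean_zero_filled = imports.fillna(0).mean()

print(mean_skipping_nan, mean_zero_filled)
```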

In [16]:
# Change data type Float --> Int
print('-------- old data type --------')
display(df_reshape_fill_na.dtypes)


# Change Float type to Int (from the "Agricultural products" to "Total merchandise" columns)
value_cols = df_reshape_fill_na.columns[2:]

# Final DataFrame: astype returns a new frame, so the dtypes change regardless of pandas version
df = df_reshape_fill_na.astype({col: 'int64' for col in value_cols})

print('-------- New import data type --------')
display(df.dtypes)

-------- old data type --------


Year                                                object
Reporting Economy                                   object
Agricultural products                              float64
Automotive products                                float64
Chemicals                                          float64
Clothing                                           float64
Electronic data processing and office equipment    float64
Food                                               float64
Fuels                                              float64
Fuels and mining products                          float64
Integrated circuits and electronic components      float64
Machinery and transport equipment                  float64
Manufactures                                       float64
Office and telecom equipment                       float64
Pharmaceuticals                                    float64
Telecommunications equipment                       float64
Textiles                                           float

-------- New import data type --------


Year                                               object
Reporting Economy                                  object
Agricultural products                               int64
Automotive products                                 int64
Chemicals                                           int64
Clothing                                            int64
Electronic data processing and office equipment     int64
Food                                                int64
Fuels                                               int64
Fuels and mining products                           int64
Integrated circuits and electronic components       int64
Machinery and transport equipment                   int64
Manufactures                                        int64
Office and telecom equipment                        int64
Pharmaceuticals                                     int64
Telecommunications equipment                        int64
Textiles                                            int64
Transport equi

In [33]:
print('-------- Display the final dataframe --------')
display(df)

print('-------- Export the final dataframe --------')
file_name = 'final_dataframe_export.xlsx'
df.to_excel(f'output/data/{file_name}', index=False)
print('DataFrame is written to Excel File successfully...')

-------- Display the final dataframe --------


Unnamed: 0,Year,Reporting Economy,Agricultural products,Automotive products,Chemicals,Clothing,Electronic data processing and office equipment,Food,Fuels,Fuels and mining products,Integrated circuits and electronic components,Machinery and transport equipment,Manufactures,Office and telecom equipment,Pharmaceuticals,Telecommunications equipment,Textiles,Transport equipment,Total merchandise
0,2010,Afghanistan,706,193,82,12,0,706,1075,1090,0,339,984,19,0,19,118,197,5154
1,2010,Albania,872,217,464,173,63,826,635,802,14,875,2731,176,155,99,168,241,4406
2,2010,Algeria,7350,3981,4452,183,520,6683,867,1493,96,16716,31367,1215,1719,599,351,5108,40473
3,2010,Angola,2882,1204,963,127,174,2764,3105,3233,19,6475,10521,561,148,369,108,2090,16667
4,2010,Antigua and Barbuda,113,19,33,8,7,107,3,6,1,99,242,15,9,8,12,51,501
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2023,2019,Viet Nam,26023,6345,29675,932,20612,19093,15529,25353,36323,108747,198303,63107,3722,6172,17284,8538,253393
2025,2019,World,1823105,1527051,2298862,536394,678319,1547817,2350244,3168025,882070,7115626,13740761,2253850,707517,693462,335278,2281774,19284167
2026,2019,Yemen,1871,182,331,60,10,1844,1340,1363,71,467,1482,111,128,29,71,211,10407
2027,2019,Zambia,570,508,1436,71,82,519,1256,1624,15,2185,4998,221,211,124,51,636,7180


-------- Export the final dataframe --------
DataFrame is written to Excel File successfully...
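
The export above assumes that the `output/data` folder already exists; writing will typically fail if it does not. A small sketch of creating the folder first is shown below. It uses a temporary directory and CSV output only so that it runs without an Excel writer engine; `toy`, `root` and the `.csv` file name are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

toy = pd.DataFrame({"Year": ["2010"], "Reporting Economy": ["A"], "Food": [1]})

# A temporary root stands in for the project folder in this sketch.
root = tempfile.mkdtemp()
out_dir = os.path.join(root, "output", "data")

# Create the nested folders if they are missing; do nothing if they exist.
os.makedirs(out_dir, exist_ok=True)

# CSV is used here only to avoid needing an Excel writer engine;
# toy.to_excel(path, index=False) would be called the same way.
path = os.path.join(out_dir, "final_dataframe_export.csv")
toy.to_csv(path, index=False)
print(os.path.exists(path))
```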


<h1 align="center"> Data Analysis </h1>

#### Descriptive Statistical Analysis

In [54]:
# Basic statistical measures: central tendency (mean, median), skewness, maximum and standard deviation.
print('------ Average ------')
display(df.mean())

print('------ Median ------')
display(df.median())

print('------ Skewness ------')
display(df.skew())

print('------ Max ------')
display(df.max())

print('------ Standard Deviation ------')
display(df.std())

------ Average ------


Year                                                        inf
Agricultural products                              1.935470e+04
Automotive products                                1.575976e+04
Chemicals                                          2.359978e+04
Clothing                                           5.134750e+03
Electronic data processing and office equipment    6.944069e+03
Food                                               1.617699e+04
Fuels                                              2.904928e+04
Fuels and mining products                          3.773991e+04
Integrated circuits and electronic components      8.367180e+03
Machinery and transport equipment                  7.241048e+04
Manufactures                                       1.403531e+05
Office and telecom equipment                       2.400794e+04
Pharmaceuticals                                    6.586491e+03
Telecommunications equipment                       8.702370e+03
Textiles                                

------ Median ------


Year                                                2014.0
Agricultural products                               1916.5
Automotive products                                  881.0
Chemicals                                           1590.5
Clothing                                             194.5
Electronic data processing and office equipment      169.5
Food                                                1720.0
Fuels                                               1636.0
Fuels and mining products                           1978.5
Integrated circuits and electronic components         28.0
Machinery and transport equipment                   3639.5
Manufactures                                        8160.5
Office and telecom equipment                         561.5
Pharmaceuticals                                      350.5
Telecommunications equipment                         358.0
Textiles                                             274.5
Transport equipment                                 1359

------ Skewness ------


Year                                                0.006188
Agricultural products                              12.583732
Automotive products                                12.200013
Chemicals                                          12.529616
Clothing                                           12.282745
Electronic data processing and office equipment    11.972217
Food                                               12.651310
Fuels                                              13.345392
Fuels and mining products                          13.016992
Integrated circuits and electronic components      11.032169
Machinery and transport equipment                  12.363089
Manufactures                                       12.454768
Office and telecom equipment                       11.769722
Pharmaceuticals                                    12.436179
Telecommunications equipment                       11.888992
Textiles                                           12.766971
Transport equipment     

------ Max ------


Year                                                   2019
Reporting Economy                                  Zimbabwe
Agricultural products                               1851346
Automotive products                                 1556336
Chemicals                                           2337456
Clothing                                             536766
Electronic data processing and office equipment      703303
Food                                                1551785
Fuels                                               3403308
Fuels and mining products                           4278365
Integrated circuits and electronic components        901048
Machinery and transport equipment                   7275483
Manufactures                                       13985429
Office and telecom equipment                        2328299
Pharmaceuticals                                      707517
Telecommunications equipment                         779213
Textiles                                

------ Standard Deviation ------


Agricultural products                              1.302534e+05
Automotive products                                1.074526e+05
Chemicals                                          1.585806e+05
Clothing                                           3.700228e+04
Electronic data processing and office equipment    4.603830e+04
Food                                               1.087722e+05
Fuels                                              2.067346e+05
Fuels and mining products                          2.662567e+05
Integrated circuits and electronic components      5.799008e+04
Machinery and transport equipment                  4.911520e+05
Manufactures                                       9.575637e+05
Office and telecom equipment                       1.578861e+05
Pharmaceuticals                                    4.460529e+04
Telecommunications equipment                       5.645024e+04
Textiles                                           2.400630e+04
Transport equipment                     

In [55]:
# Describe the dataset
df.describe()

Unnamed: 0,Agricultural products,Automotive products,Chemicals,Clothing,Electronic data processing and office equipment,Food,Fuels,Fuels and mining products,Integrated circuits and electronic components,Machinery and transport equipment,Manufactures,Office and telecom equipment,Pharmaceuticals,Telecommunications equipment,Textiles,Transport equipment,Total merchandise
count,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0,1730.0
mean,19354.7,15759.76,23599.78,5134.750289,6944.068786,16176.99,29049.28,37739.91,8367.179769,72410.48,140353.1,24007.94,6586.491329,8702.369942,3285.883815,22791.97,205268.7
std,130253.4,107452.6,158580.6,37002.275337,46038.300211,108772.2,206734.6,266256.7,57990.077252,491152.0,957563.7,157886.1,44605.288479,56450.236875,24006.296911,161415.5,1392447.0
min,0.0,0.0,0.0,0.0,0.0,3.0,1.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,21.0
25%,562.0,161.5,310.5,29.0,31.0,535.0,384.5,503.5,5.0,780.0,1883.75,103.25,92.25,58.25,46.0,259.5,3386.0
50%,1916.5,881.0,1590.5,194.5,169.5,1720.0,1636.0,1978.5,28.0,3639.5,8160.5,561.5,350.5,358.0,274.5,1359.5,12368.5
75%,8032.0,4839.75,8524.5,983.0,1523.0,6760.25,8145.25,10627.0,451.0,22003.0,45165.0,5088.5,2042.5,2412.5,1292.75,7285.25,66765.75
max,1851346.0,1556336.0,2337456.0,536766.0,703303.0,1551785.0,3403308.0,4278365.0,901048.0,7275483.0,13985430.0,2328299.0,707517.0,779213.0,343479.0,2339681.0,19836340.0
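
The final dataframe contains `World` rows, which are totals rather than individual economies, so they dominate the maximum, standard deviation and skewness reported above. The sketch below compares a mean with and without such rows on made-up data. The assumption that `World` is the only aggregate label is mine, since it is the only one visible in the outputs shown here. Passing `numeric_only=True` also skips the text columns, which recent pandas versions otherwise refuse to reduce.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "Year": ["2019", "2019", "2019", "2019"],
        "Reporting Economy": ["A", "B", "C", "World"],
        "Food": [10, 20, 30, 60],
    }
)

# Assumption: 'World' is the only aggregate label; extend this list if the
# real data also contains regional groupings.
aggregates = ["World"]
countries = toy[~toy["Reporting Economy"].isin(aggregates)]

# numeric_only=True restricts the reduction to the numeric columns.
mean_with_world = toy.mean(numeric_only=True)["Food"]
mean_countries = countries.mean(numeric_only=True)["Food"]

print(mean_with_world, mean_countries)
```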


#### Distribution of Variables

In [21]:
# Identify the distribution of the data to understand the range of values and how the data is structured.
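
One possible way to fill in this step, given the strong right skew reported in the descriptive statistics, is to look at quantiles and at the values on a log scale. The numbers below are made up to resemble a few very large importers among many small ones; this is a sketch, not a result from the WTO data.

```python
import numpy as np
import pandas as pd

# Made-up, right-skewed import values (a few very large economies).
values = pd.Series([5, 20, 60, 150, 400, 1_000, 9_000, 250_000], dtype="float64")

raw_skew = values.skew()

# log1p computes log(1 + x), so zero-filled entries remain valid inputs.
log_skew = np.log1p(values).skew()

# Quantiles summarise the spread without being dominated by the largest value.
quartiles = values.quantile([0.25, 0.5, 0.75])

print(raw_skew, log_skew)
print(quartiles)
```

If the log-scale skewness is much closer to zero, histograms and box plots are usually easier to read on the transformed values.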

#### Outliers in the dataset

In [22]:
# Identify if there are any outliers in the dataset based on statistical measures.
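
A common statistical rule for this step is Tukey's 1.5 × IQR fence. The sketch below applies it to made-up values. Note that for trade data a flagged point is often a genuinely large economy rather than an error, so the rule is a screening aid and not a reason to delete rows.

```python
import pandas as pd

values = pd.Series([5, 20, 60, 150, 400, 1_000, 9_000, 250_000], dtype="float64")

# Tukey's rule: flag points more than 1.5 * IQR outside the quartiles.
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
print(outliers.tolist())
```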

<br><br>

<h1 align="center"> Reflections </h1>

#### Summary of Data Analysis

In [23]:
# What insights should the user takeaway from EDA.

#### Questions unanswered

In [24]:
# What aspects of the research question were we unable to answer and why?

#### Recommendations

In [25]:
# What recommendations follow from the analysis?

#### Next Steps

In [26]:
# What will the analyst do next based on the analysis?