# Trade Data Visualization and Analysis

- Itighne Sinha 
- Roll Number - 055015


### Description

-   This dataset provides detailed information on international trade transactions, capturing both import and export activities. It includes comprehensive data on various aspects of trade, making it a valuable resource for business analysis, economic research, and financial modeling.
-   Type : Panel
-   Source: https://www.kaggle.com/datasets/chakilamvishwas/imports-exports-15000 (2.69 MB)
-   Dimensions (Observations, Variables): (15,000, 16)
-   Variable Category:
    1. Index_variable = Transaction ID, Date, Invoice Number, Customs Code.
    2. Non_categorical_variable = Quantity, Value, Weight.
    3. Categorical_nominal_variable = Country, Product, Shipping Method, Supplier, Customer, Import Export, Port, Category.
    4. Categorical_ordinal_variable = Payment Terms
       

- Data Variable Info:
    1. Text: 10
    2. Number: 5
    3. Date: 1
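
The type counts above can be checked from the column dtypes. The following is a minimal sketch on a tiny hypothetical frame (the three columns and their values are made up for illustration, and `classify` is a helper introduced here, not part of the original notebook). It assumes the `Date` column has already been parsed with `pd.to_datetime`.

```python
from collections import Counter

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype

# Hypothetical miniature of the dataset: one text, one numeric, one date column.
df = pd.DataFrame({
    "Country": ["X", "Y"],
    "Value": [100.5, 200.0],
    "Date": pd.to_datetime(["2023-03-01", "2023-07-15"]),
})

def classify(series: pd.Series) -> str:
    """Label a column as Date, Number, or Text based on its dtype."""
    if is_datetime64_any_dtype(series):
        return "Date"
    if is_numeric_dtype(series):
        return "Number"
    return "Text"

# Count how many columns fall into each type.
type_counts = dict(Counter(classify(df[col]) for col in df.columns))
print(type_counts)
```

Run against the full dataset, the same helper should reproduce the 10 / 5 / 1 split once `Date` is converted from text.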

## FINDINGS:

1. There is currently a net loss because the value of imports exceeds the value of exports. Managerial focus should be on gaining a stronghold and converting this into a profit by increasing exports.
2. The countries to focus on as big players are Congo, Switzerland and Poland.
3. The countries that lead in absolute trade value differ significantly from the countries with the highest average transaction value. The latter should be targeted and focused on.
4. On average, March is the peak month of the year, whereas July is when trade dips across the world.
5. Our most valuable product category is Clothing, based on both absolute and average transaction values. This can be converted into a cash cow by further improving quantity and gaining a higher market share.
6. For goods shipped by air, the most common payment term is Cash on Delivery.
7. For goods shipped by sea, the most common payment term is Net 30.
8. For goods shipped by sea, the most common payment term is Net 60.
9. There is only a minute difference in average transaction value across the three shipping methods.
10. The biggest players trade in all 5 product categories overall.
11. There is a very low degree of positive correlation between the weight and the value of transactions.
12. In general, Cash on Delivery has the highest average order value whereas Net 30 has the lowest.
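
Finding 3 compares two different rankings: total trade value per country and average transaction value per country. The following is a minimal sketch of that comparison on hypothetical data; the column names `Country` and `Value` match the dataset, while the placeholder countries `A` and `B` and their values are invented for illustration.

```python
import pandas as pd

# Hypothetical transactions: A has many small trades, B has a single large one.
sample = pd.DataFrame({
    "Country": ["A", "A", "A", "A", "B"],
    "Value": [100.0, 120.0, 110.0, 130.0, 400.0],
})

# Aggregate both measures side by side.
by_country = sample.groupby("Country")["Value"].agg(total="sum", average="mean")

top_by_total = by_country["total"].idxmax()      # leader in absolute trade value
top_by_average = by_country["average"].idxmax()  # leader in value per transaction
print(by_country)
print(top_by_total, top_by_average)
```

A country can lead on one measure and not the other, which is why the two rankings are worth reading together.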

## Managerial Insights

##### Key Areas of Focus
- Export Growth: Given the current net loss, prioritize increasing exports to achieve profitability.
- Target Markets: Focus on Congo, Switzerland, and Poland as key markets with significant potential.
- Seasonal Trade Patterns: Align operations and inventory levels with peak trade months (March) and dip months (July).
- Clothing Category: Capitalize on the clothing category's high value and potential for growth. Increase quantity and market share.
- Payment Terms: Utilize Cash on Delivery for air shipments and Net 30 or Net 60 for sea shipments, considering the respective payment terms' prevalence.
- Shipping Methods: Evaluate the cost-effectiveness of different shipping methods based on transaction value and payment terms.
- Product Diversification: Explore opportunities to expand into other product categories to reduce reliance on the clothing category and diversify revenue streams.
- Customer Segmentation: Segment customers based on payment preferences and tailor credit policies accordingly.
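
The payment-term recommendations above rest on knowing which term is most common for each shipping method. One way to derive that, sketched here on hypothetical rows (the column names `Shipping_Method` and `Payment_Terms` come from the dataset, the rows do not), is a cross-tabulation followed by a row-wise maximum.

```python
import pandas as pd

# Hypothetical orders, for illustration only.
orders = pd.DataFrame({
    "Shipping_Method": ["Air", "Air", "Air", "Sea", "Sea", "Sea"],
    "Payment_Terms": [
        "Cash on Delivery", "Cash on Delivery", "Net 30",
        "Net 60", "Net 60", "Net 30",
    ],
})

# Rows: shipping methods, columns: payment terms, cells: transaction counts.
term_counts = pd.crosstab(orders["Shipping_Method"], orders["Payment_Terms"])

# For each shipping method, pick the payment term with the highest count.
most_common = term_counts.idxmax(axis=1)
print(most_common)
```

Note that `idxmax` returns the first column in case of a tie, so near-equal counts deserve a look at `term_counts` itself before drawing conclusions.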

##### Specific Recommendations
- Export Strategy: Develop a comprehensive export strategy that includes market research, product adaptation, pricing, and distribution channels.
- Product Development: Invest in research and development to create innovative and competitive products that meet the needs of target markets.
- Negotiation Skills: Enhance negotiation skills to secure favorable terms with suppliers, customers, and logistics providers.
- Risk Management: Implement strategies to mitigate risks associated with international trade, such as currency fluctuations, political instability, and supply chain disruptions.   
- Technology Adoption: Leverage technology to improve supply chain efficiency, inventory management, and customer service.

In [3]:
import streamlit as st
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
import plotly.express as px
import seaborn as sns

#DATA PREPARATION
trade=pd.read_csv(r"C:\Users\itigh\Streamlit\Imports_Exports_Dataset.csv")
samtrade = trade.sample(n=3001, random_state=55015)
samtrade.head()
samtrade.info()
st.title('Import Export Database Dashboard')


##############################################
#1
total_import_value = samtrade[samtrade['Import_Export'] == 'Import']['Value'].sum()
total_export_value = samtrade[samtrade['Import_Export'] == 'Export']['Value'].sum()
absolute_difference = abs(total_import_value - total_export_value)
data = {
    'Type': ['Import', 'Export'],
    'Value': [total_import_value, total_export_value]
}
difference_data = pd.DataFrame(data)
plt.figure(figsize=(10, 6))
bars = plt.barh(difference_data['Type'], difference_data['Value'], color=['orange', 'blue'])
plt.title('Total Import and Export Values', fontsize=14)
plt.xlabel('Value ($)', fontsize=14)
plt.ylabel('Type', fontsize=12)
plt.xticks(rotation=45)
max_value = max(difference_data['Value'])
plt.xlim(5000000, max_value * 1.1)  # 10% more than max value for better spacing
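plt.gca().xaxis.set_major_formatter(FuncFormatter(lambda x, _: f'${x:,.0f}'))
# Label the gap according to its sign instead of always calling it a loss
balance_label = 'Net Loss' if total_import_value > total_export_value else 'Net Surplus'
plt.annotate(f'{balance_label} in trade: ${absolute_difference:,.2f}', 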
             xy=(max_value * .985, 1),  # Position it slightly offset to the right
             xytext=(10, 0),
             textcoords='offset points', 
             fontsize=8, 
             color='red', 
             ha='left')
st.pyplot(plt.gcf())
plt.close()

#######################################

#2
samtrade['Date'] = pd.to_datetime(samtrade['Date'], dayfirst=True, errors='coerce')
samtrade.set_index('Date', inplace=True)
# 'ME' (month end) requires pandas 2.2 or later; older versions use 'M'
monthly_data = samtrade.groupby([pd.Grouper(freq='ME'), 'Import_Export'])['Value'].sum().unstack()
monthly_data.reset_index(inplace=True)
monthly_data['Year'] = pd.to_datetime(monthly_data['Date']).dt.year
export_data_yearly = monthly_data.groupby('Year')['Export'].sum().reset_index()
import_data_yearly = monthly_data.groupby('Year')['Import'].sum().reset_index()

def currency_format(x, pos):
    return '${:,.0f}'.format(x)

max_export_year = export_data_yearly['Export'].idxmax()
max_import_year = import_data_yearly['Import'].idxmax()
plt.figure(figsize=(10, 6))
bars_export = plt.bar(export_data_yearly['Year'], export_data_yearly['Export'], color='blue')
bars_export[max_export_year].set_color('red')
plt.title('Total Cumulative Export Value by Year', fontsize=16)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Total Export Value', fontsize=14)
plt.xticks(rotation=45)
plt.gca().yaxis.set_major_formatter(FuncFormatter(currency_format))
plt.grid(False)
plt.tight_layout()
st.pyplot(plt.gcf())
plt.close()  # Release the export figure before drawing the import figure
plt.figure(figsize=(10, 6))
bars_import = plt.bar(import_data_yearly['Year'], import_data_yearly['Import'], color='orange')
bars_import[max_import_year].set_color('red')
plt.title('Total Cumulative Import Value by Year', fontsize=16)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Total Import Value', fontsize=14)
plt.xticks(rotation=45)
plt.gca().yaxis.set_major_formatter(FuncFormatter(currency_format))
plt.grid(False)
plt.tight_layout()
st.pyplot(plt.gcf())
plt.close()

################################


#3 Trade by category (reuses currency_format defined in section 2)
trade_type = st.selectbox("Select Trade Type", options=["Import", "Export"])
filtered_data = samtrade[samtrade['Import_Export'] == trade_type]
category_counts = filtered_data.groupby('Category')['Value'].sum()
colors = plt.cm.viridis(np.linspace(0, 1, len(category_counts)))
fig, ax = plt.subplots()
ax.bar(category_counts.index, category_counts.values, color=colors)
ax.set_xlabel('Category')  
ax.set_ylabel('Total Value')  # The bars show summed Value, not transaction counts
ax.yaxis.set_major_formatter(FuncFormatter(currency_format))
st.title("Category-wise Trade")
st.pyplot(fig)

########################################################################


#4
# Use the rows filtered by the selected trade type so the chart matches its title
category_counts = filtered_data.groupby('Category')['Value'].sum()
fig3, ax4 = plt.subplots()
ax4.pie(category_counts, labels=category_counts.index, autopct='%1.1f%%', startangle=90)
ax4.axis('equal')
st.title(f"{trade_type} Category-wise Trade Distribution")
st.pyplot(fig3)

########################################################################

#5
country_agg = samtrade.groupby('Country').agg({
    'Quantity': 'count',
    'Value': 'sum'
}).reset_index()
top_countries = country_agg.nlargest(10, 'Value')

fig_map = px.choropleth(
    top_countries,
    locations='Country',
    locationmode='country names',
    color='Value',
    hover_name='Country',
    title='Top 10 Countries by Overall Trade Value',
    color_continuous_scale='Viridis',
    labels={'Value': 'Total Trade Value'},
)
st.title("Top 10 Countries in Terms of Overall Trade Value")
st.plotly_chart(fig_map)
import_data = samtrade[samtrade['Import_Export'] == 'Import'].groupby('Country')['Value'].sum()
export_data = samtrade[samtrade['Import_Export'] == 'Export'].groupby('Country')['Value'].sum()
top_import_export = pd.DataFrame({
    'Import': import_data,
    'Export': export_data
}).reindex(top_countries['Country']) 
fig, ax = plt.subplots(figsize=(12, 6))
bar_width = 0.35
r1 = np.arange(len(top_import_export))
r2 = r1 + bar_width 
# Same colors as the first chart: orange for imports, blue for exports
bars1 = ax.bar(r1, top_import_export['Import'], color='orange', width=bar_width, edgecolor='grey', label='Import')
bars2 = ax.bar(r2, top_import_export['Export'], color='blue', width=bar_width, edgecolor='grey', label='Export')
ax.set_xlabel('Countries', fontsize=14)
ax.set_ylabel('Total Value', fontsize=14)
ax.set_title('Import and Export Values for Top 10 Countries', fontsize=16)
ax.set_xticks(r1 + bar_width / 2) 
ax.set_xticklabels(top_import_export.index, rotation=45)
ax.legend()
def currency_formatter(x, _):
    if x >= 1_000_000:
        return f'${x/1_000_000:.1f}M'  # Millions
    elif x >= 1_000:
        return f'${x/1_000:.1f}K'  # Thousands
    else:
        return f'${int(x)}'  # Actual number
ax.yaxis.set_major_formatter(FuncFormatter(currency_formatter))
st.title("Import and Export Values for Top 10 Countries")
st.pyplot(fig)


#########################################################################

#6
samtrade['Value'] = pd.to_numeric(samtrade['Value'], errors='coerce')
top_exports = samtrade[samtrade['Import_Export'] == 'Export'] \
                .groupby('Country')['Value'].sum().sort_values(ascending=False).head(3).index.tolist()

top_imports = samtrade[samtrade['Import_Export'] == 'Import'] \
                .groupby('Country')['Value'].sum().sort_values(ascending=False).head(3).index.tolist()
export_categories = samtrade[(samtrade['Country'].isin(top_exports)) & (samtrade['Import_Export'] == 'Export')]
if not export_categories.empty:
    export_category_counts = export_categories.groupby(['Country', 'Category'])['Value'].sum().unstack().fillna(0)
else:
    export_category_counts = pd.DataFrame()
import_categories = samtrade[(samtrade['Country'].isin(top_imports)) & (samtrade['Import_Export'] == 'Import')]
if not import_categories.empty:
    import_category_counts = import_categories.groupby(['Country', 'Category'])['Value'].sum().unstack().fillna(0)
else:
    import_category_counts = pd.DataFrame()
st.title('Product Categories Traded by Top Exporting and Importing Countries')
st.subheader('Export Categories by Top Exporting Countries')
if not export_category_counts.empty:
    export_chart = px.bar(export_category_counts.reset_index(),
                           x='Country',
                           y=export_category_counts.columns,
                           title='Export Categories by Country',
                           labels={'value': 'Value', 'Category': 'Category'},
                           hover_data={'Country': True})  # Show country in hover
    st.plotly_chart(export_chart)
else:
    st.write("No data available for exporting countries.")
st.subheader('Import Categories by Top Importing Countries')
if not import_category_counts.empty:
    import_chart = px.bar(import_category_counts.reset_index(),
                           x='Country',
                           y=import_category_counts.columns,
                           title='Import Categories by Country',
                           labels={'value': 'Value', 'Category': 'Category'},
                           hover_data={'Country': True})  # Show country in hover
    st.plotly_chart(import_chart)
else:
    st.write("No data available for importing countries.")

################################################################
#7
# 'Date' was already parsed and set as the index in section 2, so no conversion is needed here
shipping_data = samtrade.groupby('Shipping_Method')['Value'].sum()
colors = ['#FFB3BA', '#FFDFBA', '#FFFFBA']  
plt.figure(figsize=(8, 8)) 
plt.pie(
    shipping_data,
    labels=[f'{method} ({value:,.2f})' for method, value in zip(shipping_data.index, shipping_data)],
    colors=colors,
    startangle=90,
    autopct='%1.1f%%',
    pctdistance=0.85,
)
plt.axis('equal')
plt.title('Trade value by Shipping Method', fontsize=16)
st.pyplot(plt.gcf())

###############################################################

#8
payment_terms = ['All'] + list(samtrade['Payment_Terms'].unique())
selected_payment_term = st.selectbox("Select Payment Terms:", payment_terms)
if selected_payment_term == 'All':
    filtered_data = samtrade  
else:
    filtered_data = samtrade[samtrade['Payment_Terms'] == selected_payment_term]
transaction_counts = filtered_data.groupby(['Shipping_Method', 'Payment_Terms']).size().reset_index(name='Count')
fig = px.bar(
    transaction_counts,
    x='Shipping_Method',  
    y='Count',
    color='Payment_Terms',
    title=f'Usage of Shipping Methods for Payment Terms: {selected_payment_term}',
    labels={'Count': 'Number of Transactions'},
)
fig.update_layout(barmode='stack', xaxis_title='Shipping Method', yaxis_title='Number of Transactions')

st.plotly_chart(fig)

#######################################################################

#9
correlation_matrix = samtrade[['Value', 'Weight']].corr()
plt.figure(figsize=(6, 4))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', square=True, cbar_kws={"shrink": .8})
plt.title('Correlation Matrix between Transaction Value and Weight', fontsize=16)
st.pyplot(plt.gcf())

############################################

#10
value_summary = samtrade.groupby('Payment_Terms')['Value'].mean().reset_index()
plt.figure(figsize=(14, 6))
plt.plot(value_summary['Payment_Terms'], value_summary['Value'], marker='o', color='b', linestyle='-', linewidth=2, markersize=8)
plt.scatter(value_summary['Payment_Terms'], value_summary['Value'], color='red', s=100, edgecolor='black')
plt.title('Average Transaction Value by Payment Terms', fontsize=16)
plt.xlabel('Payment Terms', fontsize=14)
plt.ylabel('Average Value', fontsize=14)
plt.xticks(rotation=45)  # Rotate x labels for better readability
plt.grid(True)
st.pyplot(plt.gcf())


############################################





<class 'pandas.core.frame.DataFrame'>
Index: 3001 entries, 12879 to 4087
Data columns (total 16 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   Transaction_ID   3001 non-null   object 
 1   Country          3001 non-null   object 
 2   Product          3001 non-null   object 
 3   Import_Export    3001 non-null   object 
 4   Quantity         3001 non-null   int64  
 5   Value            3001 non-null   float64
 6   Date             3001 non-null   object 
 7   Category         3001 non-null   object 
 8   Port             3001 non-null   object 
 9   Customs_Code     3001 non-null   int64  
 10  Weight           3001 non-null   float64
 11  Shipping_Method  3001 non-null   object 
 12  Supplier         3001 non-null   object 
 13  Customer         3001 non-null   object 
 14  Invoice_Number   3001 non-null   int64  
 15  Payment_Terms    3001 non-null   object 
dtypes: float64(2), int64(3), object(11)
memory usage: 398.6+ KB


