# Indian Import Dataset

### Context
India is one of the world's fastest-developing nations, and international trade is a major component of any developing economy. This dataset contains India's trade data for commodities in the HS2 basket, that is, commodities classified at the two-digit chapter level of the Harmonized System.

### Content
The dataset consists of trade values for exports and imports of commodities, in millions of US dollars. The dataset is tidy: each row is a single observation.

### Acknowledgements
The data was scraped using Selenium WebDriver from the Department of Commerce, Government of India.

### Inspiration
A few questions that can be answered using this dataset are:

1. What did India export the most in any given year?
2. Which commodity forms a major chunk of trade? Does it conform to theories of international trade?
3. How has the trade between India and any given country grown over time?

Visualizing this dataset would be a great way to explore more such questions.

In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [19]:
df = pd.read_csv("2018-2010_import.csv")
pd.options.display.max_colwidth = 100

In [20]:
df.head()

| | HSCode | Commodity | value | country | year |
|---|---|---|---|---|---|
| 0 | 5 | PRODUCTS OF ANIMAL ORIGIN, NOT ELSEWHERE SPECIFIED OR INCLUDED. | 0.0 | AFGHANISTAN TIS | 2018 |
| 1 | 7 | EDIBLE VEGETABLES AND CERTAIN ROOTS AND TUBERS. | 12.38 | AFGHANISTAN TIS | 2018 |
| 2 | 8 | EDIBLE FRUIT AND NUTS; PEEL OR CITRUS FRUIT OR MELONS. | 268.6 | AFGHANISTAN TIS | 2018 |
| 3 | 9 | COFFEE, TEA, MATE AND SPICES. | 35.48 | AFGHANISTAN TIS | 2018 |
| 4 | 11 | PRODUCTS OF THE MILLING INDUSTRY; MALT; STARCHES; INULIN; WHEAT GLUTEN. | NaN | AFGHANISTAN TIS | 2018 |


In [21]:
df.tail()

| | HSCode | Commodity | value | country | year |
|---|---|---|---|---|---|
| 76119 | 81 | OTHER BASE METALS; CERMETS; ARTICLES THEREOF. | 0.14 | ZIMBABWE | 2010 |
| 76120 | 82 | TOOLS IMPLEMENTS, CUTLERY, SPOONS AND FORKS, OF BASE METAL; PARTS THEREOF OF BASE METAL. | 0.0 | ZIMBABWE | 2010 |
| 76121 | 84 | NUCLEAR REACTORS, BOILERS, MACHINERY AND MECHANICAL APPLIANCES; PARTS THEREOF. | NaN | ZIMBABWE | 2010 |
| 76122 | 85 | ELECTRICAL MACHINERY AND EQUIPMENT AND PARTS THEREOF; SOUND RECORDERS AND REPRODUCERS, TELEVISIO... | NaN | ZIMBABWE | 2010 |
| 76123 | 99 | MISCELLANEOUS GOODS. | NaN | ZIMBABWE | 2010 |


In [6]:
print(df.isnull().sum())

HSCode           0
Commodity        0
value        11588
country          0
year             0
dtype: int64
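
Only `value` has missing entries. Assuming the default integer index (the last row shown by `tail()` is 76123, so 76124 rows in total), 11588 missing values is roughly 15% of the data. The sample rows also show both `0.0` and blank values, so a missing value is probably not the same thing as a reported zero. The sketch below illustrates that distinction on the five `value` entries from the `head()` output; the variable names are illustrative only and not part of the original notebook.

```python
import pandas as pd

# The five `value` entries from the head() output; None marks the blank cell.
value = pd.Series([0.0, 12.38, 268.6, 35.48, None], name="value")

# Share of missing entries (isna() gives booleans, mean() gives the fraction).
missing_share = value.isna().mean()

# Reported zeros are counted separately from missing entries.
zero_count = int((value == 0).sum())

# Dropping missing rows keeps the reported zero.
reported = value.dropna()

print(f"missing share: {missing_share:.0%}")
print(f"reported zeros: {zero_count}")
print(f"rows kept after dropna: {len(reported)}")
```

On the full frame the same checks would be `df["value"].isna().mean()` and `df.dropna(subset=["value"])`. Whether to drop the missing rows or fill them with zero depends on what a blank means in the source, which this notebook does not establish.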


In [11]:
df.HSCode.unique()

array([ 5,  7,  8,  9, 11, 12, 13, 20, 25, 27, 39, 41, 49, 51, 52, 57, 68,
       71, 72, 74, 81, 82, 84, 85, 90, 96, 97, 98, 99, 18, 26, 33, 40, 48,
       64, 70, 73, 76, 83, 87, 94, 28, 29, 30, 31, 38, 45, 47, 78, 86, 59,
       63, 34, 44, 69, 79, 54, 22, 32,  1,  3,  4,  6, 10, 15, 23, 24, 35,
       42, 55, 93, 95, 17,  2, 19, 21, 37, 46, 50, 53, 56, 58, 60, 61, 62,
       65, 66, 67, 75, 88, 89, 91, 92, 16, 43, 80, 14, 36], dtype=int64)
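
`unique()` returns the codes in order of first appearance, which makes it hard to see whether any HS chapter is absent. A small check, using the codes copied from the output above and plain Python sets, compares them against chapters 1 to 99. The list name `hs_codes` is introduced here for illustration.

```python
# HS codes copied from the df.HSCode.unique() output above.
hs_codes = [
    5, 7, 8, 9, 11, 12, 13, 20, 25, 27, 39, 41, 49, 51, 52, 57, 68,
    71, 72, 74, 81, 82, 84, 85, 90, 96, 97, 98, 99, 18, 26, 33, 40, 48,
    64, 70, 73, 76, 83, 87, 94, 28, 29, 30, 31, 38, 45, 47, 78, 86, 59,
    63, 34, 44, 69, 79, 54, 22, 32, 1, 3, 4, 6, 10, 15, 23, 24, 35,
    42, 55, 93, 95, 17, 2, 19, 21, 37, 46, 50, 53, 56, 58, 60, 61, 62,
    65, 66, 67, 75, 88, 89, 91, 92, 16, 43, 80, 14, 36,
]

# Chapters are numbered 1 through 99 at the two-digit level.
all_chapters = set(range(1, 100))

# Chapters that never occur in the dataset.
missing = sorted(all_chapters - set(hs_codes))

print(len(set(hs_codes)), missing)  # 98 [77]
```

The only absent chapter is 77, which the Harmonized System reserves for possible future use, so the dataset covers every chapter currently in use. In the notebook itself the same check could be written as `sorted(set(range(1, 100)) - set(df.HSCode))`.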

In [17]:
df.year.unique()

array([2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010], dtype=int64)
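
With nine years of data, the first question from the Inspiration list (the top commodity in a given year) comes down to a group-and-rank operation. The following is a minimal sketch, not the author's analysis: it runs on a few rows copied from the `head()` and `tail()` outputs above, and the names `sample`, `totals`, and `top` are illustrative.

```python
import pandas as pd

# Rows copied from the head() and tail() outputs; None marks a blank value.
sample = pd.DataFrame(
    {
        "HSCode": [7, 8, 9, 11, 81, 82, 84],
        "Commodity": [
            "EDIBLE VEGETABLES AND CERTAIN ROOTS AND TUBERS.",
            "EDIBLE FRUIT AND NUTS; PEEL OR CITRUS FRUIT OR MELONS.",
            "COFFEE, TEA, MATE AND SPICES.",
            "PRODUCTS OF THE MILLING INDUSTRY; MALT; STARCHES; INULIN; WHEAT GLUTEN.",
            "OTHER BASE METALS; CERMETS; ARTICLES THEREOF.",
            "TOOLS IMPLEMENTS, CUTLERY, SPOONS AND FORKS, OF BASE METAL; PARTS THEREOF OF BASE METAL.",
            "NUCLEAR REACTORS, BOILERS, MACHINERY AND MECHANICAL APPLIANCES; PARTS THEREOF.",
        ],
        "value": [12.38, 268.6, 35.48, None, 0.14, 0.0, None],
        "country": ["AFGHANISTAN TIS"] * 4 + ["ZIMBABWE"] * 3,
        "year": [2018] * 4 + [2010] * 3,
    }
)

# Total value per commodity and year, summed over all countries.
# sum() skips missing values, so an all-missing group totals 0.0.
totals = sample.groupby(["year", "HSCode", "Commodity"], as_index=False)["value"].sum()

# For each year, keep the row with the largest total.
top = totals.loc[totals.groupby("year")["value"].idxmax()]

print(top[["year", "HSCode", "value"]])
```

Replacing `sample` with the full `df` would give one row per year from 2010 to 2018. Filtering on `country` before grouping would adapt the same pattern to the third question, trade with a single country over time.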