# Pandas: Data Exploration

In [2]:
import numpy as np
import pandas as pd

## 1. Introduction to Pandas
Pandas is a library specializing in data manipulation and analysis, built around two powerful data structures: series and dataframe. Pandas is built on top of NumPy, so the two libraries are usually imported together.

### 1.1. Series
A series can be thought of as a 1-dimensional array with an index.

In [1]:
import numpy as np
import pandas as pd

In [2]:
pd.Series(['a', 'b', 'c', 'd', 'e'])

0    a
1    b
2    c
3    d
4    e
dtype: object

In [3]:
pd.Series(np.arange(10), name='digits')

0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
Name: digits, dtype: int64
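
As a small sketch of what the index adds, the series below uses string labels instead of the default integer positions. The variable name `prices` and its values are made up for illustration.

```python
import pandas as pd

# A series with an explicit string index; labels and values are illustrative
prices = pd.Series([1000, 20, 50], index=['Laptop', 'Mouse', 'Headphone'], name='price')

# Label-based access goes through the index, position-based access through iloc
print(prices['Mouse'])        # value stored under the label 'Mouse'
print(prices.iloc[0])         # first value by position
print(prices.index.tolist())  # the labels themselves
```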

### 1.2. Data types
Like a NumPy array, a series holds elements of a single data type.
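
A quick way to see this rule in action is to mix value types and check the resulting `dtype`. The sketch below uses made-up values; the variable names are only for illustration.

```python
import pandas as pd

ints = pd.Series([1, 2, 3])            # all integers: an integer dtype
mixed_num = pd.Series([1, 2.5, 3])     # integers are upcast to float
mixed_obj = pd.Series([1, 'a', 3.0])   # no common numeric type: falls back to object

print(ints.dtype, mixed_num.dtype, mixed_obj.dtype)
```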

In [4]:
import numpy as np
import pandas as pd

#### String

In [5]:
pd.Series(['a', 'b', 'c'])

0    a
1    b
2    c
dtype: object

In [6]:
pd.Series([1, 2, 3]).astype(str)

0    1
1    2
2    3
dtype: object

#### Numeric

In [7]:
pd.Series(['01', '02', '03']).astype(int)

0    1
1    2
2    3
dtype: int64

In [8]:
pd.Series(['01', '02', '03']).astype(float)

0    1.0
1    2.0
2    3.0
dtype: float64

#### Date and time

In [9]:
pd.Series(['2020/01/01', '2020/01/02']).astype('datetime64[ns]')

0   2020-01-01
1   2020-01-02
dtype: datetime64[ns]

In [10]:
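# the two strings use different formats; on pandas 2.0 and later, pass format='mixed' if parsing fails
pd.to_datetime(pd.Series(['Jan.01-2020', '20200102']))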

0   2020-01-01
1   2020-01-02
dtype: datetime64[ns]

#### Boolean

In [11]:
pd.Series([1, 0, 0, 1, 1]).astype(bool)

0     True
1    False
2    False
3     True
4     True
dtype: bool

#### Categorical
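The categorical data type stores values drawn from a fixed set of categories. Sorting follows the declared order of the categories rather than alphabetical order.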

In [12]:
medal = pd.Categorical(
    values=['Gold', 'Bronze', 'Silver', 'Bronze', 'Gold', 'Gold', 'Bronze', 'Silver'],
    categories=['Bronze', 'Silver', 'Gold']
)

medal.sort_values()

['Bronze', 'Bronze', 'Bronze', 'Silver', 'Silver', 'Gold', 'Gold', 'Gold']
Categories (3, object): ['Bronze', 'Silver', 'Gold']
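
If the categories should also support comparisons, `pd.Categorical` accepts `ordered=True`. The following is a minimal sketch with a shortened, illustrative list of medals; the name `medal_ordered` is not part of the original example.

```python
import pandas as pd

# ordered=True declares Bronze < Silver < Gold
medal_ordered = pd.Categorical(
    values=['Gold', 'Bronze', 'Silver', 'Bronze'],
    categories=['Bronze', 'Silver', 'Gold'],
    ordered=True
)

print(medal_ordered.min(), medal_ordered.max())  # lowest and highest category present
print(medal_ordered >= 'Silver')                 # element-wise comparison against a category
```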

### 1.3. Dataframe
A dataframe is a tabular data type made of series placed side by side. It can be thought of as a labeled matrix. Here is the dataframe axes convention:
- Axis 0 is the vertical axis and contains row names; each row is called a record or an observation.
- Axis 1 is the horizontal axis and contains column names; each column is called an attribute, a field, a variable or a dimension.
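
The axis convention matters most when calling reductions such as `sum()`. The sketch below, on a tiny made-up dataframe, shows that `axis=0` collapses the rows (one result per column) while `axis=1` collapses the columns (one result per row).

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

col_totals = df.sum(axis=0)  # move down the vertical axis: one total per column
row_totals = df.sum(axis=1)  # move along the horizontal axis: one total per row

print(col_totals)
print(row_totals)
```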

In [13]:
import numpy as np
import pandas as pd

#### Initialization

In [8]:
dfTemp = {
    'product': pd.Series(['Laptop', 'Mouse', 'Headphone', 'USB']),
    'price': pd.Series(['$1000', '$20', '$50']),
    'stock': pd.Series([15, 100, 50, 100])
}
pd.DataFrame(dfTemp)

|   | product | price | stock |
|---|---|---|---|
| 0 | Laptop | $1000 | 15 |
| 1 | Mouse | $20 | 100 |
| 2 | Headphone | $50 | 50 |
| 3 | USB | NaN | 100 |
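
The missing price for USB comes from index alignment: when a dataframe is built from several series, values are matched by index label, not by position, and labels absent from a series are filled with `NaN`. A minimal sketch with made-up values:

```python
import pandas as pd

product = pd.Series(['Laptop', 'Mouse'], index=[0, 1])
price = pd.Series([20, 1000], index=[1, 0])   # same labels, listed in a different order
stock = pd.Series([15], index=[0])            # no entry for label 1

df = pd.DataFrame({'product': product, 'price': price, 'stock': stock})
print(df)
```

Each price ends up next to the product that shares its label, and the stock of the second product is `NaN`.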


#### Reading files

#### Reading CSV files
CSV (Comma-Separated Values) is a common file format for storing tabular data. In Pandas, the `read_csv()` function reads CSV files.

In [4]:
columns = ['Date', 'AAPL.Open', 'AAPL.High', 'AAPL.Low', 'AAPL.Close', 'AAPL.Volume']
pd.read_csv('../data/finance_charts_apple.csv', usecols=columns).head()

|   | Date | AAPL.Open | AAPL.High | AAPL.Low | AAPL.Close | AAPL.Volume |
|---|---|---|---|---|---|---|
| 0 | 2015-02-17 | 127.489998 | 128.880005 | 126.919998 | 127.830002 | 63152400 |
| 1 | 2015-02-18 | 127.629997 | 128.779999 | 127.449997 | 128.720001 | 44891700 |
| 2 | 2015-02-19 | 128.479996 | 129.029999 | 128.330002 | 128.449997 | 37362400 |
| 3 | 2015-02-20 | 128.619995 | 129.5 | 128.050003 | 129.5 | 48948400 |
| 4 | 2015-02-23 | 130.020004 | 133.0 | 129.660004 | 133.0 | 70974100 |

#### Reading JSON files
JSON (JavaScript Object Notation) files have a structure similar to nested Python dictionaries. In Pandas, the `read_json()` function reads JSON files and the `DataFrame.from_dict()` function builds a dataframe from a dictionary.


In [17]:
dfTemp = {
    'word': {'0':'Python','1':'Jupyter','2':'Anaconda'},
    'length': {'0':6,'1':7,'2':8}
}
pd.DataFrame.from_dict(dfTemp, orient='columns')

|   | word | length |
|---|---|---|
| 0 | Python | 6 |
| 1 | Jupyter | 7 |
| 2 | Anaconda | 8 |


In [18]:
dfTemp = {
    '0': {'word':'Python', 'length':6},
    '1': {'word':'Jupyter', 'length':7},
    '2': {'word':'Anaconda', 'length':8}
}
pd.DataFrame.from_dict(dfTemp, orient='index')

|   | word | length |
|---|---|---|
| 0 | Python | 6 |
| 1 | Jupyter | 7 |
| 2 | Anaconda | 8 |
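
For completeness, here is a minimal sketch of `read_json()` itself. To keep it self-contained, the JSON document is held in a string and wrapped in `io.StringIO` rather than read from disk; the content of `json_text` is made up for illustration.

```python
import io
import pandas as pd

# A JSON array of records; a file path could be passed to read_json instead
json_text = '[{"word": "Python", "length": 6}, {"word": "Jupyter", "length": 7}]'

df_json = pd.read_json(io.StringIO(json_text))
print(df_json)
```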


### 1.4. Useful Pandas settings

```python
# float display of 2 decimal places
pd.options.display.float_format = '{:,.2f}'.format

# display up to 1000 characters in each cell
pd.options.display.max_colwidth = 1000

# display up to 500 columns
pd.options.display.max_columns = 500

# display up to 1000 rows
pd.options.display.max_rows = 1000
```
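
The settings above stay in effect for the whole session. When a setting is needed only for a single display, `pd.option_context()` applies it temporarily, as in this small sketch (the chosen values are arbitrary):

```python
import pandas as pd

default_rows = pd.get_option('display.max_rows')

# Options set here are reverted automatically when the block ends
with pd.option_context('display.max_rows', 5, 'display.precision', 2):
    inside_rows = pd.get_option('display.max_rows')

restored_rows = pd.get_option('display.max_rows')
print(default_rows, inside_rows, restored_rows)
```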

## 2. Data exploration
Data exploration is an initial analysis step that summarizes the main characteristics of a dataset to understand what it contains. Both data wrangling and data visualization serve this purpose; this topic covers only data wrangling with Pandas.

### 2.1. Overview

In [2]:
import numpy as np
import pandas as pd

In [29]:
columns = ['video_id', 'trending_date', 'channel_title', 'category_id', 'views', 'likes', 'dislikes', 'ratings_disabled']
dfYoutube = pd.read_csv('../data/youtube_trending.csv', usecols=columns)

#### Inspecting rows

In [30]:
dfYoutube.head()

|   | video_id | trending_date | channel_title | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|---|---|---|
| 0 | 2kyS6SvSYSE | 2017-11-14 | CaseyNeistat | 22 | 748374 | 57527 | 2966 | False |
| 1 | 1ZAPwfrtAFY | 2017-11-14 | LastWeekTonight | 24 | 2418783 | 97185 | 6146 | False |
| 2 | 5qpjK5DgCt4 | 2017-11-14 | Rudy Mancuso | 23 | 3191434 | 146033 | 5339 | False |
| 3 | puqaWrEC7tY | 2017-11-14 | Good Mythical Morning | 24 | 343168 | 10172 | 666 | False |
| 4 | d380meD0W0M | 2017-11-14 | nigahiga | 24 | 2095731 | 132235 | 1989 | False |


In [31]:
dfYoutube.tail(3)

|   | video_id | trending_date | channel_title | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|---|---|---|
| 9878 | NEmFS50lsTw | 2018-02-01 | SKITTLESbrand | 24 | 483127 | 3924 | 362 | False |
| 9879 | NZ0ImXT1FZk | 2018-02-01 | Brave Wilderness | 15 | 1417253 | 44358 | 893 | False |
| 9880 | t3z_pdr6z8w | 2018-02-01 | Jackie Aina | 26 | 877585 | 115391 | 1621 | False |


In [32]:
dfYoutube.nlargest(n=5, columns='views')

|   | video_id | trending_date | channel_title | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|---|---|---|
| 6103 | FlsCjmMhFmw | 2017-12-14 | YouTube Spotlight | 24 | 149376127 | 3093544 | 1643059 | False |
| 5860 | FlsCjmMhFmw | 2017-12-13 | YouTube Spotlight | 24 | 137843120 | 3014471 | 1602383 | False |
| 5629 | FlsCjmMhFmw | 2017-12-12 | YouTube Spotlight | 24 | 125432237 | 2912702 | 1545015 | False |
| 5386 | FlsCjmMhFmw | 2017-12-11 | YouTube Spotlight | 24 | 113874632 | 2811215 | 1470383 | False |
| 2558 | TyHvyGVs42U | 2017-11-26 | LuisFonsiVEVO | 10 | 102012605 | 2376636 | 117196 | False |


In [33]:
dfYoutube.nsmallest(n=5, columns=['category_id', 'likes'])

|   | video_id | trending_date | channel_title | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|---|---|---|
| 1416 | Kn5UgGQukYQ | 2017-11-21 | hudsonunionsociety | 1 | 15058 | 0 | 0 | True |
| 1645 | Kn5UgGQukYQ | 2017-11-22 | hudsonunionsociety | 1 | 34207 | 0 | 0 | True |
| 1890 | Kn5UgGQukYQ | 2017-11-23 | hudsonunionsociety | 1 | 36137 | 0 | 0 | True |
| 2093 | Kn5UgGQukYQ | 2017-11-24 | hudsonunionsociety | 1 | 36579 | 0 | 0 | True |
| 2306 | Kn5UgGQukYQ | 2017-11-25 | hudsonunionsociety | 1 | 36931 | 0 | 0 | True |


#### Features analysis

In [36]:
dfYoutube.shape

(9881, 8)

In [35]:
dfYoutube.columns

Index(['video_id', 'trending_date', 'channel_title', 'category_id', 'views',
       'likes', 'dislikes', 'ratings_disabled'],
      dtype='object')

In [34]:
# Pandas automatically detects the data type for each column.
dfYoutube.dtypes

video_id            object
trending_date       object
channel_title       object
category_id          int64
views                int64
likes                int64
dislikes             int64
ratings_disabled      bool
dtype: object
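
Note that `trending_date` is listed as `object`, because `read_csv()` does not parse dates unless asked to. A minimal sketch of the conversion, using a tiny stand-in for `dfYoutube` (the name `dfSmall` and its two rows are only for illustration):

```python
import pandas as pd

# Tiny stand-in for dfYoutube with the date column still stored as text
dfSmall = pd.DataFrame({
    'trending_date': ['2017-11-14', '2018-02-01'],
    'views': [748374, 877585]
})

# Convert the text column to datetime64 using an explicit format
dfSmall['trending_date'] = pd.to_datetime(dfSmall['trending_date'], format='%Y-%m-%d')
print(dfSmall.dtypes)
```

Alternatively, the `parse_dates` parameter of `read_csv()` can request the conversion at load time.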

#### Statistical summary

In [37]:
dfYoutube.describe()

|   | category_id | views | likes | dislikes |
|---|---|---|---|---|
| count | 9881.0 | 9881.0 | 9881.0 | 9881.0 |
| mean | 19.970347 | 1288900.0 | 47460.89 | 3066.993 |
| std | 7.544083 | 5110157.0 | 170513.2 | 38892.08 |
| min | 1.0 | 687.0 | 0.0 | 0.0 |
| 25% | 17.0 | 89609.0 | 2031.0 | 85.0 |
| 50% | 24.0 | 311621.0 | 9003.0 | 325.0 |
| 75% | 25.0 | 1001367.0 | 29425.0 | 1131.0 |
| max | 43.0 | 149376100.0 | 3093544.0 | 1643059.0 |


In [38]:
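# correlation matrix; numeric_only=True (pandas 1.5 and later) skips the text columns
dfYoutube.corr(numeric_only=True)

|   | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|
| category_id | 1.0 | -0.068531 | -0.092913 | 0.002402 | -0.052243 |
| views | -0.068531 | 1.0 | 0.880433 | 0.673456 | -0.010271 |
| likes | -0.092913 | 0.880433 | 1.0 | 0.530022 | -0.019651 |
| dislikes | 0.002402 | 0.673456 | 0.530022 | 1.0 | -0.005567 |
| ratings_disabled | -0.052243 | -0.010271 | -0.019651 | -0.005567 | 1.0 |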


#### Missing values
Our approach is to count the missing values in each column, then express that count as a share of all rows.

In [39]:
dfYoutube.isnull().sum()

video_id            90
trending_date        0
channel_title        0
category_id          0
views                0
likes                0
dislikes             0
ratings_disabled     0
dtype: int64

In [40]:
dfYoutube.isna().mean().map('{:.0%}'.format)

video_id            1%
trending_date       0%
channel_title       0%
category_id         0%
views               0%
likes               0%
dislikes            0%
ratings_disabled    0%
dtype: object
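
The count and the rate can also be gathered in a single summary table. This is a small sketch on a made-up dataframe; the names `dfDemo`, `n_missing` and `rate` are chosen only for illustration.

```python
import numpy as np
import pandas as pd

dfDemo = pd.DataFrame({
    'video_id': ['a', None, 'c', 'd'],
    'views': [10, 20, np.nan, 40],
    'likes': [1, 2, 3, 4]
})

# One row per column: number of missing values and their share of all rows
missing = pd.DataFrame({
    'n_missing': dfDemo.isna().sum(),
    'rate': dfDemo.isna().mean()
}).sort_values('n_missing', ascending=False)
print(missing)
```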

### 2.2. In-depth exploration

In [36]:
import numpy as np
import pandas as pd

In [41]:
columns = ['video_id', 'trending_date', 'channel_title', 'category_id', 'views', 'likes', 'dislikes', 'ratings_disabled']
dfYoutube = pd.read_csv('../data/youtube_trending.csv', usecols=columns)
dfYoutube.head()

|   | video_id | trending_date | channel_title | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|---|---|---|
| 0 | 2kyS6SvSYSE | 2017-11-14 | CaseyNeistat | 22 | 748374 | 57527 | 2966 | False |
| 1 | 1ZAPwfrtAFY | 2017-11-14 | LastWeekTonight | 24 | 2418783 | 97185 | 6146 | False |
| 2 | 5qpjK5DgCt4 | 2017-11-14 | Rudy Mancuso | 23 | 3191434 | 146033 | 5339 | False |
| 3 | puqaWrEC7tY | 2017-11-14 | Good Mythical Morning | 24 | 343168 | 10172 | 666 | False |
| 4 | d380meD0W0M | 2017-11-14 | nigahiga | 24 | 2095731 | 132235 | 1989 | False |


#### Data selection

In [23]:
dfYoutube.video_id

0       2kyS6SvSYSE
1       1ZAPwfrtAFY
2       5qpjK5DgCt4
3       puqaWrEC7tY
4       d380meD0W0M
           ...     
9876    OVnxBcS5qWo
9877    ltzy5vRmN8Q
9878    NEmFS50lsTw
9879    NZ0ImXT1FZk
9880    t3z_pdr6z8w
Name: video_id, Length: 9881, dtype: object

In [24]:
dfYoutube['trending_date']

0       2017-11-14
1       2017-11-14
2       2017-11-14
3       2017-11-14
4       2017-11-14
           ...    
9876    2018-02-01
9877    2018-02-01
9878    2018-02-01
9879    2018-02-01
9880    2018-02-01
Name: trending_date, Length: 9881, dtype: object

In [25]:
# selecting multiple columns as a new dataframe
dfYoutube[['views', 'likes', 'dislikes']].head()

|   | views | likes | dislikes |
|---|---|---|---|
| 0 | 748374 | 57527 | 2966 |
| 1 | 2418783 | 97185 | 6146 |
| 2 | 3191434 | 146033 | 5339 |
| 3 | 343168 | 10172 | 666 |
| 4 | 2095731 | 132235 | 1989 |


The `iloc` and `loc` indexers make a dataframe sliceable along both dimensions, like a NumPy array. `iloc` slices by integer position, while `loc` slices by label. Notice that `loc` includes the endpoint of a slice, whereas `iloc` excludes it.

In [42]:
dfYoutube.iloc[2:7, :4]

|   | video_id | trending_date | channel_title | category_id |
|---|---|---|---|---|
| 2 | 5qpjK5DgCt4 | 2017-11-14 | Rudy Mancuso | 23 |
| 3 | puqaWrEC7tY | 2017-11-14 | Good Mythical Morning | 24 |
| 4 | d380meD0W0M | 2017-11-14 | nigahiga | 24 |
| 5 | gHZ1Qz0KiKM | 2017-11-14 | iJustine | 28 |
| 6 | 39idVpFF7NQ | 2017-11-14 | Saturday Night Live | 24 |


In [43]:
subset = ['category_id', 'trending_date']
dfYoutube.loc[:5, subset]

|   | category_id | trending_date |
|---|---|---|
| 0 | 22 | 2017-11-14 |
| 1 | 24 | 2017-11-14 |
| 2 | 23 | 2017-11-14 |
| 3 | 24 | 2017-11-14 |
| 4 | 24 | 2017-11-14 |
| 5 | 28 | 2017-11-14 |
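
The difference between the two indexers is easier to see when the index is not the default range of integers. A minimal sketch with made-up labels and values:

```python
import pandas as pd

dfDemo = pd.DataFrame({'value': [10, 20, 30, 40]}, index=['a', 'b', 'c', 'd'])

by_position = dfDemo.iloc[0:2]   # positions 0 and 1; the endpoint 2 is excluded
by_label = dfDemo.loc['a':'c']   # labels 'a' through 'c'; the endpoint 'c' is included

print(by_position)
print(by_label)
```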


#### Filtering

In [44]:
dfYoutube[dfYoutube.views >= 100e6]

|   | video_id | trending_date | channel_title | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|---|---|---|
| 2558 | TyHvyGVs42U | 2017-11-26 | LuisFonsiVEVO | 10 | 102012605 | 2376636 | 117196 | False |
| 5174 | FlsCjmMhFmw | 2017-12-10 | YouTube Spotlight | 24 | 100911567 | 2656659 | 1353647 | False |
| 5386 | FlsCjmMhFmw | 2017-12-11 | YouTube Spotlight | 24 | 113874632 | 2811215 | 1470383 | False |
| 5629 | FlsCjmMhFmw | 2017-12-12 | YouTube Spotlight | 24 | 125432237 | 2912702 | 1545015 | False |
| 5860 | FlsCjmMhFmw | 2017-12-13 | YouTube Spotlight | 24 | 137843120 | 3014471 | 1602383 | False |
| 6103 | FlsCjmMhFmw | 2017-12-14 | YouTube Spotlight | 24 | 149376127 | 3093544 | 1643059 | False |


In [45]:
dfYoutube[dfYoutube.channel_title.str.contains('VEVO')].head()

|   | video_id | trending_date | channel_title | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|---|---|---|
| 32 | n1WpP7iowLc | 2017-11-14 | EminemVEVO | 10 | 17158531 | 787419 | 43420 | False |
| 40 | PaJCFHXcWmM | 2017-11-14 | U2VEVO | 10 | 60506 | 5389 | 106 | False |
| 52 | 9t9u_yPEidY | 2017-11-14 | JenniferLopezVEVO | 10 | 9548677 | 190083 | 15015 | False |
| 62 | ujyTQNNjjDU | 2017-11-14 | GEazyMusicVEVO | 10 | 2642930 | 115795 | 3055 | False |
| 73 | lY_0mkYDZDU | 2017-11-14 | fosterthepeopleVEVO | 10 | 303956 | 18603 | 585 | False |


In [46]:
dfYoutube[(dfYoutube.views >= 1e7) | (dfYoutube.likes >= 1e5)].head()

|   | video_id | trending_date | channel_title | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|---|---|---|
| 2 | 5qpjK5DgCt4 | 2017-11-14 | Rudy Mancuso | 23 | 3191434 | 146033 | 5339 | False |
| 4 | d380meD0W0M | 2017-11-14 | nigahiga | 24 | 2095731 | 132235 | 1989 | False |
| 12 | 5E4ZBSInqUU | 2017-11-14 | marshmello | 10 | 687582 | 114188 | 1333 | False |
| 32 | n1WpP7iowLc | 2017-11-14 | EminemVEVO | 10 | 17158531 | 787419 | 43420 | False |
| 52 | 9t9u_yPEidY | 2017-11-14 | JenniferLopezVEVO | 10 | 9548677 | 190083 | 15015 | False |
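
Boolean masks can be combined in other ways as well: `&` for AND, `~` for NOT, `isin()` for membership, and `query()` for writing the condition as a string. The sketch below runs on a small stand-in dataframe; `dfDemo` and its rows are illustrative and not taken from the dataset.

```python
import pandas as pd

dfDemo = pd.DataFrame({
    'channel_title': ['EminemVEVO', 'nigahiga', 'U2VEVO', 'iJustine'],
    'category_id': [10, 24, 10, 28],
    'views': [17158531, 2095731, 60506, 100000]
})

# AND: each condition needs its own parentheses
both = dfDemo[(dfDemo.views >= 1e6) & (dfDemo.category_id == 10)]

# NOT: invert a mask with ~
not_music = dfDemo[~(dfDemo.category_id == 10)]

# membership in a list of values
in_set = dfDemo[dfDemo.category_id.isin([24, 28])]

# the same AND filter written as a query string
same_as_both = dfDemo.query('views >= 1e6 and category_id == 10')
```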


#### Sorting

In [47]:
dfYoutube.sort_values(by=['trending_date', 'views'], ascending=[True, False]).head()

|   | video_id | trending_date | channel_title | category_id | views | likes | dislikes | ratings_disabled |
|---|---|---|---|---|---|---|---|---|
| 69 | 2Vv-BfVoq4g | 2017-11-14 | Ed Sheeran | 10 | 33523622 | 1634124 | 21082 | False |
| 32 | n1WpP7iowLc | 2017-11-14 | EminemVEVO | 10 | 17158531 | 787419 | 43420 | False |
| 148 | 9wg3v-01yKQ | 2017-11-14 | HarryStylesVEVO | 10 | 9632678 | 810895 | 16139 | False |
| 52 | 9t9u_yPEidY | 2017-11-14 | JenniferLopezVEVO | 10 | 9548677 | 190083 | 15015 | False |
| 68 | Jw1Y-zhQURU | 2017-11-14 | John Lewis | 26 | 7224515 | 55681 | 10247 | False |


#### Unique values
Unique (distinct) values are the basis for many advanced data manipulation techniques.

In [48]:
dfFish = pd.read_csv('../data/us_fishery_trade.csv')
dfFish.head()

|   | Year | Month | Product Name | Country Name | Month number | Value | Feature | Unit |
|---|---|---|---|---|---|---|---|---|
| 0 | 2010 | January | SABLEFISH FRESH | UNITED ARAB EMIRATES | 1 | 2297 | EXP Quantity | kg |
| 1 | 2010 | January | SABLEFISH FRESH | JAPAN | 1 | 16025 | EXP Quantity | kg |
| 2 | 2010 | January | SABLEFISH FRESH | JAPAN | 1 | 63437 | EXP Quantity | kg |
| 3 | 2010 | January | MONKFISH FRESH | CANADA | 1 | 579 | EXP Quantity | kg |
| 4 | 2010 | January | MONKFISH FRESH | CANADA | 1 | 7975 | EXP Quantity | kg |


In [49]:
dfFish['Product Name'].unique()

array(['SABLEFISH FRESH', 'MONKFISH FRESH', 'WHITEFISH FRESH',
       'WHITEFISH FROZEN', 'MONKFISH FROZEN', 'BUTTERFISH FROZEN',
       'SABLEFISH FROZEN', 'SCORPIONFISH (SCORPAENIDAE) FROZEN',
       'WHITEFISH FILLET FRESH', 'WOLFFISH FILLET BLOCKS FROZEN > 4.5KG',
       'WOLFFISH FILLET FROZEN', 'CRAWFISH FRESHWATER FROZEN',
       'CRAWFISH FRESHWATER PEELED',
       'SABLEFISH SCALED WHETHER OR NOT DRESS FRESH NOT > 6.8KG',
       'WHITEFISH MEAT FROZEN > 6.8KG', 'SABLEFISH FRESH NOT > 6.8KG',
       'WHITEFISH MEAT FRESH',
       'JELLYFISH (RHOPILEMA SPP.) LIVE/FRESH/FROZEN/DRIED/SALTED/BRINE/SMOKED',
       'JELLYFISH PREPARED/PRESERVED'], dtype=object)

In [50]:
dfFish[['Feature', 'Unit']].drop_duplicates()

|   | Feature | Unit |
|---|---|---|
| 0 | EXP Quantity | kg |
| 17 | IMP Quantity | kg |
| 8737 | EXP Value | USD |
| 8754 | IMP Value | USD |
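
Two related methods are often useful alongside `unique()` and `drop_duplicates()`: `nunique()` counts the distinct values and `value_counts()` counts how often each one occurs. A minimal sketch on a made-up stand-in for `dfFish`:

```python
import pandas as pd

dfDemo = pd.DataFrame({
    'Feature': ['EXP Quantity', 'EXP Quantity', 'IMP Quantity', 'EXP Value'],
    'Unit': ['kg', 'kg', 'kg', 'USD']
})

n_features = dfDemo['Feature'].nunique()   # number of distinct values
counts = dfDemo['Feature'].value_counts()  # occurrences per value, most frequent first

print(n_features)
print(counts)
```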


#### Data aggregation
Data aggregation is the process of splitting the dataset into groups and applying an aggregate function (such as sum, mean or count) to each group.
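
To make the split and apply steps concrete, the sketch below performs them by hand on a tiny made-up dataframe and compares the result with a single `groupby()` call. The names `dfDemo`, `manual` and `totals` are for illustration only.

```python
import pandas as pd

dfDemo = pd.DataFrame({
    'product': ['A', 'B', 'A', 'B', 'A'],
    'value': [1, 2, 3, 4, 5]
})

# split: iterating over a groupby yields one sub-dataframe per distinct product
manual = {}
for name, group in dfDemo.groupby('product'):
    # apply: reduce each group to a single number
    manual[name] = group['value'].sum()

# the usual form: groupby splits, applies and combines the results in one call
totals = dfDemo.groupby('product')['value'].sum()

print(manual)
print(totals)
```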

In [54]:
# number of rows
dfFish.groupby('Product Name').size()

Product Name
BUTTERFISH FROZEN                                                         1398
CRAWFISH FRESHWATER FROZEN                                                1032
CRAWFISH FRESHWATER PEELED                                                1826
JELLYFISH (RHOPILEMA SPP.) LIVE/FRESH/FROZEN/DRIED/SALTED/BRINE/SMOKED     932
JELLYFISH PREPARED/PRESERVED                                               920
MONKFISH FRESH                                                            1816
MONKFISH FROZEN                                                           1288
SABLEFISH FRESH                                                            718
SABLEFISH FRESH NOT > 6.8KG                                                392
SABLEFISH FROZEN                                                          2978
SABLEFISH SCALED WHETHER OR NOT DRESS FRESH NOT > 6.8KG                     44
SCORPIONFISH (SCORPAENIDAE) FROZEN                                        1674
WHITEFISH FILLET FRESH                 

In [55]:
# number of not null values
dfFish.groupby('Product Name').count()

| Product Name | Year | Month | Country Name | Month number | Value | Feature | Unit |
|---|---|---|---|---|---|---|---|
| BUTTERFISH FROZEN | 1398 | 1398 | 1398 | 1398 | 1398 | 1398 | 1398 |
| CRAWFISH FRESHWATER FROZEN | 1032 | 1032 | 1032 | 1032 | 1032 | 1032 | 1032 |
| CRAWFISH FRESHWATER PEELED | 1826 | 1826 | 1826 | 1826 | 1826 | 1826 | 1826 |
| JELLYFISH (RHOPILEMA SPP.) LIVE/FRESH/FROZEN/DRIED/SALTED/BRINE/SMOKED | 932 | 932 | 932 | 932 | 932 | 932 | 932 |
| JELLYFISH PREPARED/PRESERVED | 920 | 920 | 920 | 920 | 920 | 920 | 920 |
| MONKFISH FRESH | 1816 | 1816 | 1816 | 1816 | 1816 | 1816 | 1816 |
| MONKFISH FROZEN | 1288 | 1288 | 1288 | 1288 | 1288 | 1288 | 1288 |
| SABLEFISH FRESH | 718 | 718 | 718 | 718 | 718 | 718 | 718 |
| SABLEFISH FRESH NOT > 6.8KG | 392 | 392 | 392 | 392 | 392 | 392 | 392 |
| SABLEFISH FROZEN | 2978 | 2978 | 2978 | 2978 | 2978 | 2978 | 2978 |


In [56]:
# total export value of each product
dfFish[dfFish['Feature'] == 'EXP Value']\
    .groupby('Product Name')[['Value']].sum().reset_index()

|   | Product Name | Value |
|---|---|---|
| 0 | BUTTERFISH FROZEN | 10604660 |
| 1 | CRAWFISH FRESHWATER FROZEN | 5393263 |
| 2 | JELLYFISH (RHOPILEMA SPP.) LIVE/FRESH/FROZEN/DRIED/SALTED/BRINE/SMOKED | 5489508 |
| 3 | JELLYFISH PREPARED/PRESERVED | 8198507 |
| 4 | MONKFISH FRESH | 32555268 |
| 5 | MONKFISH FROZEN | 93921677 |
| 6 | SABLEFISH FRESH | 66094664 |
| 7 | SABLEFISH FROZEN | 799896744 |
| 8 | SCORPIONFISH (SCORPAENIDAE) FROZEN | 73262535 |


In [54]:
# total trade each year
dfFish.groupby(['Feature', 'Year', 'Unit'])['Value'].sum().reset_index()

|   | Feature | Year | Unit | Value |
|---|---|---|---|---|
| 0 | EXP Quantity | 2010 | kg | 15578318 |
| 1 | EXP Quantity | 2011 | kg | 28106436 |
| 2 | EXP Quantity | 2012 | kg | 15054158 |
| 3 | EXP Quantity | 2013 | kg | 13856966 |
| 4 | EXP Quantity | 2014 | kg | 19607838 |
| 5 | EXP Quantity | 2015 | kg | 17477174 |
| 6 | EXP Quantity | 2016 | kg | 10759863 |
| 7 | EXP Quantity | 2017 | kg | 12236928 |
| 8 | EXP Quantity | 2018 | kg | 9972012 |
| 9 | EXP Quantity | 2019 | kg | 9990474 |


In [53]:
# monthly trade value, averaged over the years
dfFish[dfFish.Feature.str.contains('Value')]\
    .groupby(['Feature', 'Year', 'Month number'])['Value'].sum().reset_index()\
    .groupby(['Feature', 'Month number'])['Value'].mean().reset_index()

|   | Feature | Month number | Value |
|---|---|---|---|
| 0 | EXP Value | 1 | 4210348.9 |
| 1 | EXP Value | 2 | 3185211.9 |
| 2 | EXP Value | 3 | 3465167.2 |
| 3 | EXP Value | 4 | 8694505.8 |
| 4 | EXP Value | 5 | 14953813.7 |
| 5 | EXP Value | 6 | 15216529.7 |
| 6 | EXP Value | 7 | 12062280.3 |
| 7 | EXP Value | 8 | 9735869.9 |
| 8 | EXP Value | 9 | 10347673.5 |
| 9 | EXP Value | 10 | 10211269.2 |


In [59]:
# apply multiple aggregate functions at once
dfFish.groupby('Product Name')[['Value']].agg(['sum', 'mean', 'size']).reset_index()

|   | Product Name | Value (sum) | Value (mean) | Value (size) |
|---|---|---|---|---|
| 0 | BUTTERFISH FROZEN | 60432604 | 43227.9 | 1398 |
| 1 | CRAWFISH FRESHWATER FROZEN | 136660222 | 132422.7 | 1032 |
| 2 | CRAWFISH FRESHWATER PEELED | 677768550 | 371176.64 | 1826 |
| 3 | JELLYFISH (RHOPILEMA SPP.) LIVE/FRESH/FROZEN/DRIED/SALTED/BRINE/SMOKED | 28905543 | 31014.53 | 932 |
| 4 | JELLYFISH PREPARED/PRESERVED | 43337180 | 47105.63 | 920 |
| 5 | MONKFISH FRESH | 38634548 | 21274.53 | 1816 |
| 6 | MONKFISH FROZEN | 120627952 | 93655.24 | 1288 |
| 7 | SABLEFISH FRESH | 84580465 | 117800.09 | 718 |
| 8 | SABLEFISH FRESH NOT > 6.8KG | 5882279 | 15005.81 | 392 |
| 9 | SABLEFISH FROZEN | 895572564 | 300729.54 | 2978 |
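
When flat column names are preferred over a two-level header, named aggregation is one option: each keyword passed to `agg()` becomes an output column, defined by a `(column, function)` pair. The sketch below uses a tiny made-up dataframe; the output names `total`, `average` and `n_rows` are arbitrary.

```python
import pandas as pd

dfDemo = pd.DataFrame({
    'Product Name': ['X', 'X', 'Y'],
    'Value': [100, 300, 50]
})

# keyword = (source column, aggregate function)
summary = (
    dfDemo.groupby('Product Name')
    .agg(total=('Value', 'sum'), average=('Value', 'mean'), n_rows=('Value', 'size'))
    .reset_index()
)
print(summary)
```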
