Script to analyse fisheries data. The data concern:

* Target 14.4 (FMSY/F and B/BMSY)
* Target 14.6 (TAC/Catch)
* Target 14.a (SAD/TAC) 

Data comes from several sources:

For all indicators, we need data on Catches by Country (ICES Official Catches data)

Additionally, for each indicator we need:

* FMSY/F and B/BMSY: Stock Assessment data (ICES)
* TAC/Catch: TAC data (Carpenter or, alternatively, from EC PDFs)
* SAD/TAC: TAC (Carpenter or, alternatively, from OJ PDFs) and SAD (ICES data or Carpenter)

If we want to compute all indicators from the same data, the merging procedure is:

Catches by Country <-> Stock Assessment <-> TAC <-> SAD

If we calculate indicators separately, we can do three merges and get:
* Catches by Country <-> Stock Assessment: FMSY/F and B/BMSY
* Catches by Country <-> TAC: TAC/Catch
* Catches by Country <-> TAC <-> SAD: SAD/TAC
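
The single-dataset route described above can be sketched with toy tables. The column names (`stock`, `year`, `catch`, `F`, `TAC`, `SAD`) and all values are hypothetical placeholders, not the real ICES or Carpenter columns; the sketch only shows how a chain of merges on a shared key yields the three ratios.

```python
import pandas as pd

# Toy tables; names and numbers are hypothetical, not taken from the real datasets.
keys = ["stock", "year"]
catches = pd.DataFrame({"stock": ["cod.27.7a", "ple.27.7a"], "year": [2016, 2016],
                        "catch": [100.0, 50.0]})
assessment = pd.DataFrame({"stock": ["cod.27.7a", "ple.27.7a"], "year": [2016, 2016],
                           "F": [0.4, 0.2], "FMSY": [0.3, 0.25]})
tac = pd.DataFrame({"stock": ["cod.27.7a", "ple.27.7a"], "year": [2016, 2016],
                    "TAC": [120.0, 60.0]})
sad = pd.DataFrame({"stock": ["cod.27.7a", "ple.27.7a"], "year": [2016, 2016],
                    "SAD": [110.0, 60.0]})

# Catches by Country <-> Stock Assessment <-> TAC <-> SAD
merged = catches.merge(assessment, on=keys).merge(tac, on=keys).merge(sad, on=keys)

# One ratio per indicator
merged["FMSY_F"] = merged["FMSY"] / merged["F"]
merged["TAC_Catch"] = merged["TAC"] / merged["catch"]
merged["SAD_TAC"] = merged["SAD"] / merged["TAC"]
print(merged[["stock", "year", "FMSY_F", "TAC_Catch", "SAD_TAC"]])
```

With the default inner merges only stocks present in every table survive, which is the main trade-off against computing each indicator from its own, smaller merge.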

Notes from Rudi's meeting
1. Difference between OfficialNominalCatches and catches from the StockAssessment dataset. For which stocks do we have large differences? <10% would be alright.
2. Stock assessment is not done every year. Rudi: Do they provide a trend for the years that they don't assess? It's odd to use different years for different stocks, but it's possible.
3. BMSY in ICES: Btrigger. Bpa is the same as Btrigger, go for that one.
4. Effort is used because it's hard to track the catches. Send questions to Rudi.


In [406]:
import os
import pandas as pd
import numpy as np

In [3]:
pd.set_option('display.max_columns', 500)

In [4]:
countries=['Belgium','Bulgaria','Cyprus', 'Greece','Germany','Croatia','Italy', 
           'Denmark','Estonia','Spain','Finland','France','Ireland','Lithuania',
           'Latvia','Malta','Netherlands','Poland','Portugal', 'Romania',
           'Sweden','United Kingdom of Great Britain and Northern Ireland', "United Kingdom of GB"]

In [5]:
country_to_abbrev = {
    "Andorra": "AD",
    "United Arab Emirates": "AE",
    "Afghanistan": "AF",
    "Antigua and Barbuda": "AG",
    "Anguilla": "AI",
    "Albania": "AL",
    "Armenia": "AM",
    "Angola": "AO",
    "Antarctica": "AQ",
    "Argentina": "AR",
    "American Samoa": "AS",
    "Austria": "AT",
    "Australia": "AU",
    "Aruba": "AW",
    "Åland Islands": "AX",
    "Azerbaijan": "AZ",
    "Bosnia and Herzegovina": "BA",
    "Barbados": "BB",
    "Bangladesh": "BD",
    "Belgium": "BE",
    "Burkina Faso": "BF",
    "Bulgaria": "BG",
    "Bahrain": "BH",
    "Burundi": "BI",
    "Benin": "BJ",
    "Saint Barthélemy": "BL",
    "Bermuda": "BM",
    "Brunei Darussalam": "BN",
    "Bolivia (Plurinational State of)": "BO",
    "Bonaire, Sint Eustatius and Saba": "BQ",
    "Brazil": "BR",
    "Bahamas": "BS",
    "Bhutan": "BT",
    "Bouvet Island": "BV",
    "Botswana": "BW",
    "Belarus": "BY",
    "Belize": "BZ",
    "Canada": "CA",
    "Cocos (Keeling) Islands": "CC",
    "Congo, Democratic Republic of the": "CD",
    "Central African Republic": "CF",
    "Congo": "CG",
    "Switzerland": "CH",
    "Côte d'Ivoire": "CI",
    "Cook Islands": "CK",
    "Chile": "CL",
    "Cameroon": "CM",
    "China": "CN",
    "Colombia": "CO",
    "Costa Rica": "CR",
    "Cuba": "CU",
    "Cabo Verde": "CV",
    "Curaçao": "CW",
    "Christmas Island": "CX",
    "Cyprus": "CY",
    "Czechia": "CZ",
    "Germany": "DE",
    "Djibouti": "DJ",
    "Denmark": "DK",
    "Dominica": "DM",
    "Dominican Republic": "DO",
    "Algeria": "DZ",
    "Ecuador": "EC",
    "Estonia": "EE",
    "Egypt": "EG",
    "Western Sahara": "EH",
    "Eritrea": "ER",
    "Spain": "ES",
    "Ethiopia": "ET",
    "Finland": "FI",
    "Fiji": "FJ",
    "Falkland Islands (Malvinas)": "FK",
    "Micronesia (Federated States of)": "FM",
    "Faroe Islands": "FO",
    "France": "FR",
    "Gabon": "GA",
    "United Kingdom of Great Britain and Northern Ireland": "UK", #original is GB, Eurostat uses UK
    "United Kingdom of GB": "GB",
    "Grenada": "GD",
    "Georgia": "GE",
    "French Guiana": "GF",
    "Guernsey": "GG",
    "Ghana": "GH",
    "Gibraltar": "GI",
    "Greenland": "GL",
    "Gambia": "GM",
    "Guinea": "GN",
    "Guadeloupe": "GP",
    "Equatorial Guinea": "GQ",
    "Greece": "EL", #original ir GR, Eurostat uses EL
    "South Georgia and the South Sandwich Islands": "GS",
    "Guatemala": "GT",
    "Guam": "GU",
    "Guinea-Bissau": "GW",
    "Guyana": "GY",
    "Hong Kong": "HK",
    "Heard Island and McDonald Islands": "HM",
    "Honduras": "HN",
    "Croatia": "HR",
    "Haiti": "HT",
    "Hungary": "HU",
    "Indonesia": "ID",
    "Ireland": "IE",
    "Israel": "IL",
    "Isle of Man": "IM",
    "India": "IN",
    "British Indian Ocean Territory": "IO",
    "Iraq": "IQ",
    "Iran (Islamic Republic of)": "IR",
    "Iceland": "IS",
    "Italy": "IT",
    "Jersey": "JE",
    "Jamaica": "JM",
    "Jordan": "JO",
    "Japan": "JP",
    "Kenya": "KE",
    "Kyrgyzstan": "KG",
    "Cambodia": "KH",
    "Kiribati": "KI",
    "Comoros": "KM",
    "Saint Kitts and Nevis": "KN",
    "Korea (Democratic People's Republic of)": "KP",
    "Korea, Republic of": "KR",
    "Kuwait": "KW",
    "Cayman Islands": "KY",
    "Kazakhstan": "KZ",
    "Lao People's Democratic Republic": "LA",
    "Lebanon": "LB",
    "Saint Lucia": "LC",
    "Liechtenstein": "LI",
    "Sri Lanka": "LK",
    "Liberia": "LR",
    "Lesotho": "LS",
    "Lithuania": "LT",
    "Luxembourg": "LU",
    "Latvia": "LV",
    "Libya": "LY",
    "Morocco": "MA",
    "Monaco": "MC",
    "Moldova, Republic of": "MD",
    "Montenegro": "ME",
    "Saint Martin (French part)": "MF",
    "Madagascar": "MG",
    "Marshall Islands": "MH",
    "North Macedonia": "MK",
    "Mali": "ML",
    "Myanmar": "MM",
    "Mongolia": "MN",
    "Macao": "MO",
    "Northern Mariana Islands": "MP",
    "Martinique": "MQ",
    "Mauritania": "MR",
    "Montserrat": "MS",
    "Malta": "MT",
    "Mauritius": "MU",
    "Maldives": "MV",
    "Malawi": "MW",
    "Mexico": "MX",
    "Malaysia": "MY",
    "Mozambique": "MZ",
    "Namibia": "NA",
    "New Caledonia": "NC",
    "Niger": "NE",
    "Norfolk Island": "NF",
    "Nigeria": "NG",
    "Nicaragua": "NI",
    "Netherlands": "NL",
    "Norway": "NO",
    "Nepal": "NP",
    "Nauru": "NR",
    "Niue": "NU",
    "New Zealand": "NZ",
    "Oman": "OM",
    "Panama": "PA",
    "Peru": "PE",
    "French Polynesia": "PF",
    "Papua New Guinea": "PG",
    "Philippines": "PH",
    "Pakistan": "PK",
    "Poland": "PL",
    "Saint Pierre and Miquelon": "PM",
    "Pitcairn": "PN",
    "Puerto Rico": "PR",
    "Palestine, State of": "PS",
    "Portugal": "PT",
    "Palau": "PW",
    "Paraguay": "PY",
    "Qatar": "QA",
    "Réunion": "RE",
    "Romania": "RO",
    "Serbia": "RS",
    "Russian Federation": "RU",
    "Rwanda": "RW",
    "Saudi Arabia": "SA",
    "Solomon Islands": "SB",
    "Seychelles": "SC",
    "Sudan": "SD",
    "Sweden": "SE",
    "Singapore": "SG",
    "Saint Helena, Ascension and Tristan da Cunha": "SH",
    "Slovenia": "SI",
    "Svalbard and Jan Mayen": "SJ",
    "Slovakia": "SK",
    "Sierra Leone": "SL",
    "San Marino": "SM",
    "Senegal": "SN",
    "Somalia": "SO",
    "Suriname": "SR",
    "South Sudan": "SS",
    "Sao Tome and Principe": "ST",
    "El Salvador": "SV",
    "Sint Maarten (Dutch part)": "SX",
    "Syrian Arab Republic": "SY",
    "Eswatini": "SZ",
    "Turks and Caicos Islands": "TC",
    "Chad": "TD",
    "French Southern Territories": "TF",
    "Togo": "TG",
    "Thailand": "TH",
    "Tajikistan": "TJ",
    "Tokelau": "TK",
    "Timor-Leste": "TL",
    "Turkmenistan": "TM",
    "Tunisia": "TN",
    "Tonga": "TO",
    "Turkey": "TR",
    "Trinidad and Tobago": "TT",
    "Tuvalu": "TV",
    "Taiwan, Province of China": "TW",
    "Tanzania, United Republic of": "TZ",
    "Ukraine": "UA",
    "Uganda": "UG",
    "United States Minor Outlying Islands": "UM",
    "United States of America": "US",
    "Uruguay": "UY",
    "Uzbekistan": "UZ",
    "Holy See": "VA",
    "Saint Vincent and the Grenadines": "VC",
    "Venezuela (Bolivarian Republic of)": "VE",
    "Virgin Islands (British)": "VG",
    "Virgin Islands (U.S.)": "VI",
    "Viet Nam": "VN",
    "Vanuatu": "VU",
    "Wallis and Futuna": "WF",
    "Samoa": "WS",
    "Yemen": "YE",
    "Mayotte": "YT",
    "South Africa": "ZA",
    "Zambia": "ZM",
    "Zimbabwe": "ZW",
}
    
# invert the dictionary
abbrev_to_country = dict(map(reversed, country_to_abbrev.items()))

## Data manipulation

In [824]:
# from ICES_Indicators excel file
stockW = ['reb.27.1-21',
'reb.27.1-2',
'bli.27.5b67',
'bli.27.5b67',
'whb.27.1-91214',
'whb.27.1-91214',
'cap.27.1-2',
'cap.27.1-2',
'cod.27.5a',
'cod.27.5a',
'cod.27.6a',
'cod.27.6a',
'cod.27.7a',
'cod.27.7a',
'cod.27.7e-k',
'cod.27.7e-k',
'cod.2127.1f14',
'cod.2127.1f14',
'cod.21.1',
'cod.21.1',
'cod.27.47d20',
'cod.27.47d20',
'cod.27.1-2',
'cod.27.1-2',
'cod.27.21',
'cod.27.21',
'cod.27.22-24',
'cod.27.22-24',
'ldb.27.8c9a',
'ldb.27.8c9a',
'reg.27.1-2',
'reg.27.1-2',
'reg.27.561214',
'reg.27.561214',
'ghl.27.561214',
'ghl.27.561214',
'had.27.5a',
'had.27.5a',
'had.27.6b',
'had.27.6b',
'had.27.7a',
'had.27.7a',
'had.27.7b-k',
'had.27.7b-k',
'had.27.46a20',
'had.27.46a20',
'had.27.1-2',
'had.27.1-2',
'hke.27.8c9a',
'hke.27.8c9a',
'hke.27.3a46-8abd',
'hke.27.3a46-8abd',
'her.27.5a',
'her.27.5a',
'her.27.nirs',
'her.27.nirs',
'her.27.6a7bc',
'her.27.6a7bc',
'her.27.irls',
'her.27.irls',
'her.27.3a47d',
'her.27.3a47d',
'her.27.1-24a514a',
'her.27.1-24a514a',
'her.27.28',
'her.27.28',
'her.27.20-24',
'her.27.20-24',
'her.27.25-2932',
'her.27.25-2932',
'her.27.3031',
'her.27.3031',
'hom.27.9a',
'hom.27.9a',
'hom.27.2a4a5b6a7a-ce-k8',
'hom.27.2a4a5b6a7a-ce-k8',
'lin.27.5a',
'lin.27.5a',
'mac.27.nea',
'mac.27.nea',
'lez.27.6b',
'lez.27.6b',
'lez.27.4a6a',
'lez.27.4a6a',
'meg.27.7b-k8abd',
'meg.27.7b-k8abd',
'meg.27.8c9a',
'meg.27.8c9a',
'pra.27.3a4a',
'pra.27.3a4a',
'pra.27.1-2',
'pra.27.1-2',
'nop.27.3a4',
'nop.27.3a4',
'ple.27.7a',
'ple.27.7a',
'ple.27.7d',
'ple.27.7d',
'ple.27.7fg',
'ple.27.7fg',
'ple.27.420',
'ple.27.420',
'ple.27.21-23',
'ple.27.21-23',
'pok.27.5a',
'pok.27.5a',
'pok.27.1-2',
'pok.27.1-2',
'pok.27.3a46',
'pok.27.3a46',
'pok.27.6',
'pok.27.6',
'san.sa.3r',
'san.sa.3r',
'san.sa.4',
'san.sa.4',
'san.sa.2r',
'san.sa.2r',
'san.sa.1r',
'san.sa.1r',
'bss.27.4bc7ad-h',
'bss.27.4bc7ad-h',
'bss.27.8ab',
'bss.27.8ab',
'sol.27.7a',
'sol.27.7a',
'sol.27.7d',
'sol.27.7d',
'sol.27.7e',
'sol.27.7e',
'sol.27.7fg',
'sol.27.7fg',
'sol.27.8ab',
'sol.27.8ab',
'sol.27.4',
'sol.27.4',
'sol.27.20-24',
'sol.27.20-24',
'spr.27.4',
'spr.27.4',
'spr.27.22-32',
'spr.27.22-32',
'usk.27.5a14',
'usk.27.5a14',
'mon.27.8c9a',
'mon.27.8c9a',
'mon.27.78abd',
'mon.27.78abd',
'whg.27.6a',
'whg.27.6a',
'whg.27.7b-ce-k',
'whg.27.7b-ce-k',
'whg.27.47d',
'whg.27.47d',
]
stockW = set(stockW)

In [6]:
# https://neweconomics.org/campaigns/landing-the-blame
pd.ExcelFile(("../data/icesTACcomparison.xlsx")).sheet_names

['Menus',
 'Table of contents',
 'ICES advice',
 'Council agreed TAC',
 'Comparison',
 'Table for results',
 'Overall results',
 'Results by Member State',
 'Sea basin',
 'Third country',
 'Results by % difference',
 'Results by # of TACs',
 'Results by # of TACs by MS',
 'Results by species',
 'ID',
 'Matching - ICES-TAC',
 'Matching - Final TACs',
 'Matching - EU share',
 'Matching - TAC split share',
 'Matching - ICES area share']

In [7]:
pd.read_excel(("../data/icesTACcomparison.xlsx"), 1)

Unnamed: 0,Tab,Description
0,ICES advice,ICES advice by TAC and year
1,Council agreed TAC,Council agreed TAC by Member State and year
2,Comparison,Comparing ICES advice and agreed TACs by Membe...
3,Overall results,Calculates the difference between TACs and ICE...
4,Results by Member State,Calculates the difference between TACs and ICE...
5,Results by % difference,Calculates the difference between TACs and ICE...
6,Results by # of TACs,Calculates the number of TACs that exceed ICES...
7,Results by # of TACs by MS,Calculates the number of TACs that exceed ICES...
8,Results by third country share,Calculates the difference between TACs and ICE...
9,Results by species,Calculates the difference between TACs and ICE...


#### TAC

In [379]:
# Can be extracted from the TAC vs Advice dataset or the TAC dataset. The latter has some more rows
tac = pd.read_excel(("../data/icesTACcomparison.xlsx"), 'Council agreed TAC')
# tac = pd.read_csv(("../data/RecordOfEuropeanTAC.csv"))

In [380]:
# extract acronym of species and convert to lower case
tac["speciesAcronym"] = tac["Reference"].str.extract(r"\(([\w\-]+)" , expand=False).str.lower()
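
As a quick check of what the pattern captures, here is a small sketch run on two `Reference` values that appear in the Comparison sheet further down. The character class `[\w\-]+` stops at the first `/`, so only the species code in front of the slash is kept.

```python
import pandas as pd

# Two Reference strings as they appear in the Comparison sheet
refs = pd.Series([
    "(PLE/03AN.) - 2016 - TAC - Original",
    "(PLE/2A3AX4) - 2012 - TAC - Original",
])

# Same pattern as above: text after "(" made of word characters or hyphens
acronyms = refs.str.extract(r"\(([\w\-]+)", expand=False).str.lower()
print(acronyms.tolist())
```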

In [381]:
# filter years of interest; the only level we care about is 'TAC'
tac = tac[(tac.Year.isin([2012,2016,2020])) & (tac.Level == 'TAC') ]

In [407]:
tac = tac[['Reference', 'TAC ID', 'Species', 'TAC Zone', 'Level',
       'TAC for comparison', 'Year', 'Amendment/Original', 'speciesAcronym']]

#### ICES advice

In [776]:
# From ICES official database https://asd.ices.dk/AdviceList
sadOff = pd.read_csv(("../data/adviceICES_Data_26_04_2023.csv"))
# drop deprecated Advice 
sadOff = sadOff[sadOff['AdviceStatus'] == 'Advice'].copy()

In [777]:
# transform dates
sadOff[['AdviceApplicableFrom', 'AdviceApplicableUntil']] = sadOff[['AdviceApplicableFrom', 'AdviceApplicableUntil']].apply(pd.to_datetime, format='%d/%m/%Y')

In [778]:
# drop duplicates based on StockCode and AdviceApplicableFrom (three cases had duplicates)
sadOff = sadOff.drop_duplicates(subset=['StockCode', 'AdviceApplicableFrom'], keep='last')

In [779]:
# Create the new column with years between AdviceApplicableFrom and AdviceApplicableUntil = years in which the advice is valid
date_range = lambda x: range(x['AdviceApplicableFrom'].year, x['AdviceApplicableUntil'].year+1)
sadOff = sadOff.assign(year=sadOff.apply(date_range, axis=1)).explode('year', ignore_index=True)

# keep columns of interest
sadOff = sadOff[['year', 'StockCode', 'AdviceValue', 'AdviceType', 'AdviceApplicableFrom', 'AdviceApplicableUntil',
 'AdviceValueUnit', 'AssessmentYear', 'AssessmentKey','AdviceKey', 'AdviceDOI'] ].copy()

# transform dates to year only
sadOff[['AdviceApplicableFrom', 'AdviceApplicableUntil']] = sadOff[['AdviceApplicableFrom', 'AdviceApplicableUntil']].transform(lambda x: x.dt.year) 
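
The expansion of each advice row into one row per valid year can be illustrated on a toy frame. The stock codes and dates below are hypothetical; only the column names follow the advice dataset.

```python
import pandas as pd

# Hypothetical advice rows: the first one is valid for two calendar years
advice = pd.DataFrame({
    "StockCode": ["stock.a", "stock.b"],
    "AdviceApplicableFrom": pd.to_datetime(["01/01/2019", "01/01/2020"], format="%d/%m/%Y"),
    "AdviceApplicableUntil": pd.to_datetime(["31/12/2020", "31/12/2020"], format="%d/%m/%Y"),
})

# Build the list of years covered by each row, then explode to one row per year
advice["year"] = [
    list(range(start.year, end.year + 1))
    for start, end in zip(advice["AdviceApplicableFrom"], advice["AdviceApplicableUntil"])
]
advice_long = advice.explode("year", ignore_index=True)
print(advice_long[["StockCode", "year"]])
```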


In [780]:
# check stock for which advice AdviceApplicableFrom is not the year before AdviceApplicableUntil
# sadOff.loc[(sadOff['AdviceApplicableFrom']  < sadOff['AdviceApplicableUntil'] - 1)] 

In [781]:
# In some cases advice covers two years and new advice is issued for the second year. We drop the old advice for that second year
sadOff = sadOff.drop_duplicates(subset=['StockCode', 'year'], keep='last') 
sadOff[sadOff.duplicated(subset=['StockCode', 'year'], keep=False)] 

Unnamed: 0,year,StockCode,AdviceValue,AdviceType,AdviceApplicableFrom,AdviceApplicableUntil,AdviceValueUnit,AssessmentYear,AssessmentKey,AdviceKey,AdviceDOI
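
Note that `keep='last'` keeps whichever row comes last in the file, so the result depends on row order. A minimal sketch of a more explicit variant, assuming `AssessmentYear` is a reasonable proxy for how recent the advice is (an assumption, not something the dataset documentation confirms here); the rows are hypothetical.

```python
import pandas as pd

# Hypothetical rows: two pieces of advice cover stock.a in 2020
advice = pd.DataFrame({
    "StockCode": ["stock.a", "stock.a", "stock.b"],
    "year": [2020, 2020, 2020],
    "AssessmentYear": [2019, 2018, 2019],
    "AdviceValue": [150.0, 100.0, 80.0],
})

# Sort so the most recent assessment is last within each stock-year, then keep it
latest = (
    advice.sort_values(["StockCode", "year", "AssessmentYear"])
          .drop_duplicates(subset=["StockCode", "year"], keep="last")
          .reset_index(drop=True)
)
print(latest)
```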


In [775]:
# example of stock for which AdviceApplicableFrom is not the year before AdviceApplicableUntil
# sadOff[sadOff['StockCode'] == 'whg.27.6b'] 

In [193]:
# From Carpenter 
sad = pd.read_excel(("../data/icesTACcomparison.xlsx"), 'ICES advice')
sad = sad[sad.Year.isin([2012,2016,2020])][['ICES code', 'Advice', 'Year',
        'ICES advice', 'Catches corresponding to advice',
       'Landings corresponding to advice','Choices']]

#### TAC-SAD Comparison

In [16]:
sad_tac = pd.read_excel(("../data/icesTACcomparison.xlsx"), 'Comparison') 

In [17]:
sad_tac[(sad_tac.Species == 'Plaice') & (sad_tac.Year.isin([2012,2016,2020])) & (sad_tac.Level == 'TAC') & (sad_tac['ICES area'] == 4)  ]

Unnamed: 0,Final Ref,Reference,TAC ID,Species,TAC Zone,ICES area,Level,TAC,Year,Amendment/Original,Sea,Amendment check,Agreement,ICES Advice,TAC above advice (t),TAC above advice (%),Net difference (t),Net difference (%),Advice change ICES,Advice change %,Advice change quantity,Previous year TAC,ICES with Council
16399,(PLE/03AN.) - 2016 - TAC - Final,(PLE/03AN.) - 2016 - TAC - Original,(PLE/03AN.),Plaice,Skagerrak,4,TAC,10056.410256,2016,Original,Atlantic,Final,Norway joint management,11108.11597,0.0,0.0,-1051.705714,-0.094679,0.766839,4.512713,-9093.11597,2015.0,11108.11597
16431,(PLE/03AN.) - 2020 - TAC - Final,(PLE/03AN.) - 2020 - TAC - Original,(PLE/03AN.),Plaice,Skagerrak,4,TAC,19647.0,2020,Original,Atlantic,Final,Norway joint management,19647.0,0.0,0.0,0.0,0.0,0.170719,8.731055,-17628.0,2019.0,19647.0
16808,(PLE/2A3AX4) - 2012 - TAC - Final,(PLE/2A3AX4) - 2012 - TAC - Original,(PLE/2A3AX4),Plaice,4; Union waters of 2a; that part of 3a not cov...,4,TAC,84410.0,2012,Original,Atlantic,Final,Norway joint management,84410.0,0.0,0.0,0.0,0.0,0.314798,40.974142,-82399.0,2011.0,84410.0
16862,(PLE/2A3AX4) - 2016 - TAC - Final,(PLE/2A3AX4) - 2016 - TAC - Original,(PLE/2A3AX4),Plaice,4; Union waters of 2a; that part of 3a not cov...,4,TAC,128376.218324,2016,Original,Atlantic,Final,Norway joint management,141801.88403,0.0,0.0,-13425.665706,-0.094679,0.104583,69.373143,-139786.88403,2015.0,141801.88403
16898,(PLE/2A3AX4) - 2020 - TAC - Final,(PLE/2A3AX4) - 2020 - TAC - Original,(PLE/2A3AX4),Plaice,4; Union waters of 2a; that part of 3a not cov...,4,TAC,146852.0,2020,Original,Atlantic,Final,Norway joint management,146852.0,0.0,0.0,0.0,0.0,0.170742,71.735017,-144833.0,2019.0,146852.0


### StockAssessment

In [1031]:
# https://standardgraphs.ices.dk/stockList.aspx
stock = pd.read_csv(("../data/stockAssesment/StockAssessment.csv"), names=range(138))
stock.columns = stock.iloc[0,:]
stock = stock[1:]


In [1032]:
# extract EN name and acronym of species
stock['enName'] = stock["StockDescription"].str.extract(r"^(.+?) ?(?:\d|\(|$)" , expand=False)
stock['speciesAcronym'] = stock["FishStock"].str.extract(r"^([^.]*).*" , expand=False)
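
To see what the two patterns return, here is a small sketch on illustrative rows. The descriptions are written in the usual "Common name (Latin name) in ..." style and are assumed examples, not rows copied from the CSV.

```python
import pandas as pd

# Illustrative rows (assumed format, not copied from StockAssessment.csv)
demo = pd.DataFrame({
    "StockDescription": [
        "Cod (Gadus morhua) in Division 7.a (Irish Sea)",
        "Horse mackerel (Trachurus trachurus) in Division 9.a (Atlantic Iberian waters)",
    ],
    "FishStock": ["cod.27.7a", "hom.27.9a"],
})

# English name: shortest prefix that is followed by a digit, "(" or end of string
en_name = demo["StockDescription"].str.extract(r"^(.+?) ?(?:\d|\(|$)", expand=False)
# Species acronym: everything before the first dot of the stock code
acronym = demo["FishStock"].str.extract(r"^([^.]*).*", expand=False)
print(en_name.tolist(), acronym.tolist())
```

A description without a parenthesised Latin name would be cut at its first digit instead, so the extracted name is worth spot-checking.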

In [1033]:
# print(stock.columns.tolist())

In [1034]:
# check what data is in the custom columns
# stock[~stock.CustomUnits6.isna()].dropna(axis=1)

In [1035]:
# select useful years and columns
stock = stock.loc[:,['Year','enName', 'speciesAcronym','FishStock','StockKey', 'SpeciesName', "ICES Areas (splited with character '~')", 
                     'StockSize', 'StockSizeDescription', 'StockSizeUnits',
                      'FishingPressure', 'FishingPressureDescription', 'FishingPressureUnits',
                     'Flim', 'Fpa', 'Blim', 'Bpa', 'FMSY', 'MSYBtrigger', 
                     'CatchesLadingsUnits', 'Landings', 'OfficialLandings', 'Catches',
                     'Report', 'AssessmentKey','AssessmentYear']]
stock.Year = stock.Year.astype(int)
stock = stock[stock.Year.isin([2012,2016,2020])]

In [1036]:
# To merge with OfficialCatches database
# Split the areas column into several columns and melt to get one row per species-area (as in the OfficialNominalCatches data)

sepAreas = stock["ICES Areas (splited with character '~')"].str.split('~', expand=True)
sepAreas = sepAreas.apply(lambda x: x.str.strip() if x.dtype == "object" else x)

# melt, using every column except the split area columns as identifiers
stockMelt = stock.join(sepAreas)
stockMelt = stockMelt.melt(id_vars=['Year', 'enName', 'speciesAcronym', 'FishStock', 'StockKey',
       'SpeciesName', "ICES Areas (splited with character '~')", 'StockSize',
       'StockSizeDescription', 'StockSizeUnits', 'FishingPressure',
       'FishingPressureDescription', 'FishingPressureUnits', 'Flim', 'Fpa',
       'Blim', 'Bpa', 'FMSY', 'MSYBtrigger', 'CatchesLadingsUnits', 'Landings',
       'OfficialLandings', 'Catches', 'Report', 'AssessmentKey','AssessmentYear'], 
       var_name='fullArea', value_name='area')
stockMelt = stockMelt.dropna(subset=['area'])


# (later realised this could have been done with explode)

stockExplode = (stock.set_index(['Year', 'enName', 'speciesAcronym', 'FishStock', 'StockKey',
       'SpeciesName', 'StockSize',
       'StockSizeDescription', 'StockSizeUnits', 'FishingPressure',
       'FishingPressureDescription', 'FishingPressureUnits', 'Flim', 'Fpa',
       'Blim', 'Bpa', 'FMSY', 'MSYBtrigger', 'CatchesLadingsUnits', 'Landings',
       'OfficialLandings', 'Catches', 'Report','AssessmentKey','AssessmentYear'])
   .apply(lambda x: x.str.split('~').explode())
   .reset_index())  
stockExplode["ICES Areas (splited with character '~')"] = stockExplode["ICES Areas (splited with character '~')"].str.strip()
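
For reference, the same one-row-per-area reshaping can be written more compactly with `Series.str.split` followed by `DataFrame.explode`, which avoids listing every other column. This is a sketch on a toy frame; the short column name `areas` and the area strings are hypothetical stand-ins for the "ICES Areas" column.

```python
import pandas as pd

# Toy frame; "areas" stands in for the '~'-separated ICES Areas column
demo = pd.DataFrame({
    "FishStock": ["cod.27.7a", "ple.27.420"],
    "Year": [2016, 2016],
    "areas": ["27.7.a", "27.4.a ~ 27.4.b ~ 27.4.c ~ 27.3.a.20"],
})

# Split into lists, explode to one row per area, then trim whitespace
demo_long = demo.assign(area=demo["areas"].str.split("~")).explode("area", ignore_index=True)
demo_long["area"] = demo_long["area"].str.strip()
print(demo_long[["FishStock", "Year", "area"]])
```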

In [1037]:
# there are duplicates because of updated reports, we drop them keeping the last version
stockMelt = stockMelt.drop_duplicates(subset=['Year', 'area', 'FishStock'], keep='last')
# for consistency, do the same with the exploded version
stockExplode = stockExplode.drop_duplicates(subset=['Year', "ICES Areas (splited with character '~')", 'FishStock'], keep='last')

In [1038]:
# https://intercatch.ices.dk/CS/Data/Reports/StrataDefinitionAreaList.aspx
icesAreas = pd.DataFrame(pd.read_html('../data/icesAreas.html')[0])
icesAreas.columns = icesAreas.iloc[0]
icesAreas = icesAreas[1:]
icesAreas.head()

Unnamed: 0,Area,Area Description,ParentArea,Parent Area Description
1,21.1,NAFO subarea 21.1,21,Northwest Atlantic
2,21.2,NAFO subarea 21.2,21,Northwest Atlantic
3,21.3,NAFO subarea 21.3,21,Northwest Atlantic
4,21.4,NAFO subarea 21.4,21,Northwest Atlantic
5,21.5,NAFO subarea 21.5,21,Northwest Atlantic


In [1039]:
icesAreas=pd.DataFrame(["27.1.a","27.1.b","27.1_NK","27.2_NK","27.2.a.1","27.2.a.2","27.2.a_NK","27.2.b.1","27.2.b.2","27.2.b_NK",
            "27.3.a.20","27.3.a.21","27.3.a_NK","27.3_NK","27.3.b.23","27.3.c.22","27.3.d_NK","27.3.d.24","27.3.d.25",
            "27.3.d.26","27.3.d.27","27.3.d.28_NK","27.3.d.29","27.3.d.30","27.3.d.31","27.3.d.32","27.3.d.28.1","27.3.d.28.2",
            "27.4.a","27.4.b","27.4.c","27.4_NK","27.5_NK","27.5.a.1","27.5.a.2","27.5.a_NK","27.5.b.2","27.5.b_NK","27.5.b.1.a",
            "27.5.b.1.b","27.5.b.1_NK","27.6.a","27.6.b_NK","27.6_NK","27.6.b.1","27.6.b.2","27.7.a","27.7.b","27.7.c.1","27.7.c.2",
            "27.7.c_NK","27.7.d","27.7.e","27.7.f","27.7.g","27.7.h","27.7.j.1","27.7.j.2","27.7.j_NK","27.7.k.1","27.7.k.2","27.7.k_NK",
            "27.8.a","27.8.b","27.8.c","27.8.d.1","27.8.d.2","27.8.d_NK","27.8.e.1","27.8.e.2","27.8.e_NK","27.8_NK","27.9.a","27.9_NK",
            "27.9.b.1","27.9.b.2","27.9.b_NK","27.10.a.1","27.10.a.2","27.10.a_NK","27.10.b","27.10_NK","27.12.a.1","27.12.a.2","27.12.a.3",
            "27.12.a.4","27.12.a_NK","27.12.b","27.12.c","27.12_NK","27.14.a","27.14.b.1","27.14.b.2","27.14.b_NK","27.14_NK","27_NK"], columns= ['icesAreas'])

# compare the areas in the stock dataset 
areaStock = pd.DataFrame(stockMelt.area.unique(), columns=['areaStock'])

# merge
areaCompare = areaStock.merge(icesAreas, left_on='areaStock', right_on='icesAreas', how='outer', indicator=True)
areaCompare[areaCompare['_merge']=='both'].reset_index(drop=True)

Unnamed: 0,areaStock,icesAreas,_merge
0,27.4.b,27.4.b,both
1,27.4.a,27.4.a,both
2,27.5.a.1,27.5.a.1,both
3,27.3.d.28.1,27.3.d.28.1,both
4,27.3.a.20,27.3.a.20,both
...,...,...,...
61,27.7.k.2,27.7.k.2,both
62,27.12.c,27.12.c,both
63,27.2.b.2,27.2.b.2,both
64,27.8.e.1,27.8.e.1,both


In [1040]:
# Wilfried's list contained stocks with StockSizeDescription = SSB|SSB/B45cm|B/Bmsy|Stock Size: Relative|Spawning Stock Biomass|B_index (at least for the 2021 stock assessment)
stockSSB = pd.DataFrame(stockMelt[(stockMelt.StockSizeDescription.str.contains("SSB|SSB/B45cm|B/Bmsy|Stock Size: Relative|Spawning Stock Biomass|B_index", na=False)) | (stockMelt.StockSizeDescription.isna())]
    .drop_duplicates(subset=['FishStock'], keep='last')[['FishStock', 'StockSizeDescription']]).rename(columns={'FishStock':'FishStockSSB'})

stockW = pd.DataFrame(stockW, columns=['FishStockW'])

stockCompare = stockSSB.merge(stockW, left_on='FishStockSSB', right_on='FishStockW', how='outer', indicator=True)
stockCompare[stockCompare['_merge']=='right_only'].reset_index(drop=True)

Unnamed: 0,FishStockSSB,StockSizeDescription,FishStockW,_merge
0,,,spr.27.4,right_only
1,,,reb.27.1-21,right_only
2,,,pok.27.6,right_only
3,,,hom.27.2a4a5b6a7a-ce-k8,right_only
4,,,cap.27.1-2,right_only
5,,,cod.27.1-2,right_only
6,,,reb.27.1-2,right_only
7,,,her.27.6a7bc,right_only
8,,,had.27.1-2,right_only
9,,,pra.27.1-2,right_only


In [1041]:
# drop the "parent" areas to avoid double counting
stockMelt = stockMelt[stockMelt.area.isin(icesAreas.icesAreas)]
# keep the stocks in Wilfried's list (alternative: filter by StockSizeDescription)
stockMelt = stockMelt[stockMelt.FishStock.isin(stockW.FishStockW)]

### OfficialNominalCatches

In [1044]:
# https://www.ices.dk/data/dataset-collections/Pages/Fish-catch-and-stock-assessment.aspx
catches = pd.read_csv(("../data/OfficialNominalCatches/ICESCatchDataset2006-2020.csv")) 

In [1045]:
# add country name column
catches['geo'] = catches.Country.map(abbrev_to_country).fillna(catches.Country)

# filter countries of interest. commented as we want to compare total catches from stockAssessment 
# catches = catches[catches.geo.isin(countries)]

# convert Species to lower case
catches.Species = catches.Species.str.lower()

# the 2020 column contains letters alongside the numbers, so we split it into a numeric part and a flag part
catches[['2020','2020c']] = catches['2020'].str.split(" +",expand = True) 
catches['2020'] = catches['2020'].astype(np.float64)

# keep useful columns
catches = catches.loc[:,['Country','geo','Species','Area','2012','2016','2020']]
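
Splitting on spaces assumes every value has at most one trailing token. A slightly more defensive sketch uses a regular expression with named groups and `pd.to_numeric`. The sample values (a number optionally followed by a letter flag) are an assumption about how the 2020 column looks, not taken from the file.

```python
import pandas as pd

# Hypothetical raw values: plain numbers, a number with a letter flag, and a missing value
raw = pd.Series(["10.5", "0 c", "3.25", None])

# Capture the numeric part and an optional alphabetic flag separately
parts = raw.str.extract(r"^\s*(?P<value>[\d.]+)\s*(?P<flag>[A-Za-z]*)\s*$")
value = pd.to_numeric(parts["value"], errors="coerce")
print(pd.DataFrame({"value": value, "flag": parts["flag"]}))
```

Values that do not fit the pattern become missing rather than raising, so it is worth counting missing values before and after the conversion.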

In [1046]:
catches = catches.melt(id_vars=['Country', 'geo','Species', 'Area'], 
       var_name='Year', value_name='CatchesCountry')

In [1055]:
catchesPivot = catches.pivot_table( columns='geo', index=['Year','Area', 'Species'] , values='CatchesCountry', aggfunc='sum').reset_index()
catchesPivot['Year'] = catchesPivot['Year'].astype(int)

# only keep the areas in the icesArea dataset

catchesPivot = catchesPivot[catchesPivot.Area.isin(icesAreas.icesAreas)]

## Merge Data

### Merge Stock with Catches

In [1058]:
# merge stock and catches data; the outer merge with an indicator lets us check non-matching rows on either side
mStockCatch = stockMelt.merge(catchesPivot, left_on=['speciesAcronym','area','Year'], 
                            right_on=['Species', 'Area', 'Year'], how='outer', indicator='_mergeCatch')

mStockCatch['sumOffCatches'] = mStockCatch[['Belgium', 'China', 'Denmark', 'Estonia', 'Faroe Islands',
       'Finland', 'France', 'Germany', 'Greenland', 'Guernsey', 'Iceland',
       'Ireland', 'Isle of Man', 'Japan', 'Jersey', 'Korea, Republic of',
       'Latvia', 'Lithuania', 'Netherlands', 'Norway', 'Poland', 'Portugal',
       'Russian Federation', 'Spain', 'Sweden', 'Taiwan, Province of China',
       'United Kingdom of GB']].sum(axis=1)

In [1059]:
mStockCatch[mStockCatch['_mergeCatch']=='right_only']

Unnamed: 0,Year,enName,speciesAcronym,FishStock,StockKey,SpeciesName,ICES Areas (splited with character '~'),StockSize,StockSizeDescription,StockSizeUnits,FishingPressure,FishingPressureDescription,FishingPressureUnits,Flim,Fpa,Blim,Bpa,FMSY,MSYBtrigger,CatchesLadingsUnits,Landings,OfficialLandings,Catches,Report,AssessmentKey,AssessmentYear,fullArea,area,Area,Species,Belgium,China,Denmark,Estonia,Faroe Islands,Finland,France,Germany,Greenland,Guernsey,Iceland,Ireland,Isle of Man,Japan,Jersey,"Korea, Republic of",Latvia,Lithuania,Netherlands,Norway,Poland,Portugal,Russian Federation,Spain,Sweden,"Taiwan, Province of China",United Kingdom of GB,_mergeCatch,sumOffCatches
1266,2012,,,,,,,,,,,,,,,,,,,,,,,,,,,,27.1.a,caa,,,,,,,0.0,,,,0.0,,,,,,,,,4.55,,,,0.02,,,0.0,right_only,4.57
1267,2012,,,,,,,,,,,,,,,,,,,,,,,,,,,,27.1.a,cab,,,,,,,,,,,0.0,,,,,,0.0,,,40.90,0.0,,,,,,,right_only,40.90
1268,2012,,,,,,,,,,,,,,,,,,,,,,,,,,,,27.1.a,cap,,,,,,,,,0.0,,,,,,,,,,,0.00,,,0.0,,,,,right_only,0.00
1269,2012,,,,,,,,,,,,,,,,,,,,,,,,,,,,27.1.a,cas,,,,0.0,,,,,0.0,,0.0,,,,,,0.0,0.0,,25.90,0.0,,,,,,,right_only,25.90
1270,2012,,,,,,,,,,,,,,,,,,,,,,,,,,,,27.1.a,cat,,,,0.0,,,,0.0,0.0,,,,,,,,,,,,,0.0,,,,,,right_only,0.00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
43447,2020,,,,,,,,,,,,,,,,,,,,,,,,,,,,27_NK,whm,,,,,,,,,,,,,,3.0,,,,,,,,,,,,,,right_only,3.00
43448,2020,,,,,,,,,,,,,,,,,,,,,,,,,,,,27_NK,wit,,,,,,,0.0,,,,,,,,,,,,0.0,,,0.0,,,0.0,,0.0,right_only,0.00
43449,2020,,,,,,,,,,,,,,,,,,,,,,,,,,,,27_NK,wra,,,,,,,0.0,,,,,,,,,,,,,,,,,,,,,right_only,0.00
43450,2020,,,,,,,,,,,,,,,,,,,,,,,,,,,,27_NK,wrf,,,,,,,0.0,,,,,,,,,,,,,,,,,,,,,right_only,0.00


In [1060]:
print(
len(stockMelt), 
len(catchesPivot), 
len(mStockCatch[mStockCatch._mergeCatch == 'both']),
len(mStockCatch[mStockCatch._mergeCatch == 'left_only']),
len(mStockCatch[mStockCatch._mergeCatch == 'right_only'])
)

1266 43308 1149 117 42186
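
The same match counts can be read in one step from the indicator column with `value_counts`. A minimal sketch on toy frames follows; the column names mirror the merge keys above, and the values are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the stock and catches tables
left = pd.DataFrame({"Species": ["cod", "ple"], "Area": ["27.7.a", "27.7.a"],
                     "Year": [2016, 2016], "StockSize": [1.0, 2.0]})
right = pd.DataFrame({"Species": ["cod", "her"], "Area": ["27.7.a", "27.7.a"],
                      "Year": [2016, 2016], "CatchesCountry": [10.0, 20.0]})

# Outer merge with a named indicator column, as in the cell above
m = left.merge(right, on=["Species", "Area", "Year"], how="outer", indicator="_mergeCatch")

# Number of rows matched on both sides, left only, and right only
counts = m["_mergeCatch"].value_counts()
print(counts)
```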


In [1051]:
# Check difference between total catches from OfficialCatches and StockAssessment
mStockCatch.Catches = mStockCatch.Catches.astype(np.float64)
mStockCatch['diffCatches'] = (mStockCatch.sumOffCatches - mStockCatch.Catches)/mStockCatch.Catches
# mStockCatch.to_csv('../dataTemp/checkCatches.csv', index=False)
mStockCatch['diffCatches'].describe()

count    1045.000000
mean       -0.741492
std         2.349157
min        -1.000000
25%        -1.000000
50%        -0.999473
75%        -0.930598
max        59.285888
Name: diffCatches, dtype: float64
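
Because `Catches` was an identifier column in the melt, the stock-level figure is repeated on every area row, while `sumOffCatches` is computed per area. One way to compare like with like is to sum the official catches over all areas of a stock before taking the ratio, and then flag stocks outside the 10% tolerance mentioned in the meeting notes. This is only a sketch on hypothetical numbers; it assumes both sources use the same units, which should be checked against `CatchesLadingsUnits`.

```python
import pandas as pd

# Hypothetical rows: stock.a spans two areas, stock.b one
rows = pd.DataFrame({
    "FishStock": ["stock.a", "stock.a", "stock.b"],
    "Year": [2016, 2016, 2016],
    "Catches": [100.0, 100.0, 50.0],       # stock-level total repeated per area row
    "sumOffCatches": [60.0, 35.0, 80.0],   # official catches per area
})

# Aggregate to one row per stock and year before comparing
per_stock = rows.groupby(["FishStock", "Year"], as_index=False).agg(
    Catches=("Catches", "first"),
    sumOffCatches=("sumOffCatches", "sum"),
)
per_stock["diffCatches"] = (per_stock["sumOffCatches"] - per_stock["Catches"]) / per_stock["Catches"]

# Flag stocks whose relative difference exceeds 10%
per_stock["largeDiff"] = per_stock["diffCatches"].abs() > 0.10
print(per_stock)
```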

### Merge StockCatches with TAC
To merge the TAC with the advice, we use Carpenter's matching dictionary (the 'Matching - ICES-TAC' sheet). 

In [378]:
dictTACSAD = pd.read_excel(("../data/icesTACcomparison.xlsx"), 'Matching - ICES-TAC') 
dictTACSAD

Unnamed: 0,TAC ID,2020,2019,2018,2017,2016,2015,2014,2013,2012,2011,2010,2009,2008,2007,2006,2005,2004,2003,2002,2001
0,(ALF/3X14-),alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea,alf.27.nea
1,(ANE/08.),ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8,ane.27.8
2,(ANE/9/3411),ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a,ane.27.9a
3,(ANF/04-N.),anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46,anf.27.3a46
4,(ANF/07.),mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd,mon.27.78abd + ank.27.78abd
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
237,(WHG/08.),whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a,whg.27.89a
238,(WHG/2AC4.),whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d,whg.27.47d
239,(WHG/56-14),whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b,whg.27.6a + whg.27.6b
240,(WHG/7X7A-C),whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.,whg.27.7b-ce-k.


In [160]:
dictTACSAD = pd.read_excel("../data/icesTACcomparison.xlsx", sheet_name='Matching - ICES-TAC')
# keep the three years of interest and melt to long format (one row per TAC ID and year)
dictTACSAD = dictTACSAD[['TAC ID', 2012, 2016, 2020]]
dictTACSAD = dictTACSAD.melt(id_vars=['TAC ID'], var_name='Year', value_name='FishStock').copy()
# split combined entries ('stockA + stockB') into one row per stock, then merge with the
# original so each row keeps the full entry and shows whether the TAC applies to more than one stock
dictTACSADexplode = dictTACSAD.set_index(['TAC ID', 'Year']).apply(lambda x: x.str.split('+').explode()).reset_index()
# strip the whitespace left around each stock code by the split
dictTACSADexplode = dictTACSADexplode.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
dictTACSAD = dictTACSAD.merge(dictTACSADexplode, on=['TAC ID', 'Year'])
# FishStockTAC: full entry for the TAC (possibly several stocks); FishStock: a single stock code
dictTACSAD.columns = ['TAC ID', 'Year', 'FishStockTAC', 'FishStock']
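
The melt, split and explode steps above can be illustrated on a tiny stand-in table. This is only a sketch: the frame `wide` and its two rows are a hand-made excerpt of the matching sheet shown earlier, and it uses `Series.str.split` followed by `DataFrame.explode`, which should give the same long layout as the cell above without the intermediate merge.

```python
import pandas as pd

# Toy stand-in for the 'Matching - ICES-TAC' sheet (two rows, two years)
wide = pd.DataFrame({
    "TAC ID": ["(ANF/07.)", "(WHG/08.)"],
    2012: ["mon.27.78abd + ank.27.78abd", "whg.27.89a"],
    2016: ["mon.27.78abd + ank.27.78abd", "whg.27.89a"],
})

# Wide to long: one row per TAC ID and year, keeping the full entry
long = wide.melt(id_vars=["TAC ID"], var_name="Year", value_name="FishStockTAC")

# One row per individual stock; the combined entry stays in FishStockTAC
long["FishStock"] = long["FishStockTAC"].str.split("+")
long = long.explode("FishStock", ignore_index=True)
long["FishStock"] = long["FishStock"].str.strip()

print(long)
```

A TAC that covers two stocks ends up with two rows per year, which is why later merges on `FishStock` can attach the same TAC to more than one stock.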

In [163]:
# convert Year to float (Year in mStockCatchSADoff is float because of NaNs)
dictTACSAD.Year = dictTACSAD.Year.astype(np.float64)

In [169]:
# drop non-matching rows from previous merge
mStockCatch = mStockCatch[mStockCatch._mergeCatch == 'both'].copy()

mStockCatchTAC = mStockCatch.merge(dictTACSAD, on=['FishStock', 'Year'], how='outer', indicator='_mergeDictTAC')

In [170]:
mStockCatchTAC = mStockCatchTAC.merge(tac, on=['TAC ID', 'Year'], how='left', indicator='_mergeTAC')

In [383]:
mStockCatchTAC

Unnamed: 0,Year,enName,speciesAcronym_x,FishStock,StockKey,SpeciesName,ICES Areas (splited with character '~'),StockSize,StockSizeDescription,StockSizeUnits,FishingPressure,FishingPressureDescription,FishingPressureUnits,Flim,Fpa,Blim,Bpa,FMSY,MSYBtrigger,CatchesLadingsUnits,Landings,OfficialLandings,Catches,Report,AssessmentKey,AssessmentYear,fullArea,area,Area,Species_x,Belgium,China,Denmark,Estonia,Faroe Islands,Finland,France,Germany,Greenland,Guernsey,Iceland,Ireland,Isle of Man,Japan,Jersey,"Korea, Republic of",Latvia,Lithuania,Netherlands,Norway,Poland,Portugal,Russian Federation,Spain,Sweden,"Taiwan, Province of China",United Kingdom of GB,_mergeCatch,sumOffCatches,diffCatches,TAC ID,FishStockTAC,_mergeDictTAC,Reference,Species_y,TAC Zone,Level,TAC for comparison,Amendment/Original,speciesAcronym_y,_mergeTAC
0,2012.0,Sandeel,san,san.sa.1r,169246,Ammodytes,27.4.b ~ 27.4.c,152970,SSB,tonnes,0.112,F,Year-1,,,110000,145000,,,tonnes,,,45954.0,https://doi.org/10.17895/ices.advice.10000,16936,2022,0,27.4.b,27.4.b,san,,,50064.15,,0.0,,0.00,1708.44,,,,,,,,,,0.0,317.0,42144.09,,,,,5652.0,,0.0,both,99885.68,1.173601,,,left_only,,,,,,,,left_only
1,2012.0,Sandeel,san,san.sa.1r,169246,Ammodytes,27.4.b ~ 27.4.c,152970,SSB,tonnes,0.112,F,Year-1,,,110000,145000,,,tonnes,,,45954.0,https://doi.org/10.17895/ices.advice.10000,16936,2022,1,27.4.c,27.4.c,san,,,0.00,,,,3.28,0.00,,,,,,,,,,0.0,0.0,0.00,,,,,0.0,,0.0,both,3.28,-0.999929,,,left_only,,,,,,,,left_only
2,2012.0,Sandeel,san,san.sa.2r,169247,Ammodytes,27.4.b ~ 27.4.c,42319,SSB,tonnes,0.153,F,ratio,,,56000,84000,,,tonnes,,,12672.0,https://doi.org/10.17895/ices.advice.10001,16937,2022,0,27.4.b,27.4.b,san,,,50064.15,,0.0,,0.00,1708.44,,,,,,,,,,0.0,317.0,42144.09,,,,,5652.0,,0.0,both,99885.68,6.882393,,,left_only,,,,,,,,left_only
3,2012.0,Sandeel,san,san.sa.2r,169247,Ammodytes,27.4.b ~ 27.4.c,42319,SSB,tonnes,0.153,F,ratio,,,56000,84000,,,tonnes,,,12672.0,https://doi.org/10.17895/ices.advice.10001,16937,2022,1,27.4.c,27.4.c,san,,,0.00,,,,3.28,0.00,,,,,,,,,,0.0,0.0,0.00,,,,,0.0,,0.0,both,3.28,-0.999741,,,left_only,,,,,,,,left_only
4,2012.0,Sandeel,san,san.sa.4,169249,Ammodytes,27.4.a ~ 27.4.b,101114,SSB,tonnes,0.022,F,Year-1,,,48000,102000,,,tonnes,,,2618.0,https://doi.org/10.17895/ices.advice.10003,16940,2022,1,27.4.b,27.4.b,san,,,50064.15,,0.0,,0.00,1708.44,,,,,,,,,,0.0,317.0,42144.09,,,,,5652.0,,0.0,both,99885.68,37.153430,,,left_only,,,,,,,,left_only
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5761,2020.0,Black-bellied anglerfish,ank,ank.27.78abd,195875,Lophius budegassa,27.7.a ~ 27.7.b ~ 27.7.c.1 ~ 27.7.c.2 ~ 27.7.d...,42068.0,Combined-sex SSB,tonnes,0.09226,,,,0.257,12073.0,16776.0,0.163,16776.0,tonnes,8676.0,,9601.0,https://doi.org/10.17895/ices.advice.21394104.v2,17653,2022,13,27.8.a,27.8.a,ank,,,,,,,,,,,,,,,,,,,,,,,,0.25,,,,both,0.25,-0.999974,(ANF/8ABDE.),mon.27.78abd + ank.27.78abd,both,(ANF/8ABDE.) - 2020 - TAC - Original,Anglerfish,"8a, 8b, 8d and 8e",TAC,9458,Original,anf,both
5762,2020.0,Black-bellied anglerfish,ank,ank.27.78abd,195875,Lophius budegassa,27.7.a ~ 27.7.b ~ 27.7.c.1 ~ 27.7.c.2 ~ 27.7.d...,42068.0,Combined-sex SSB,tonnes,0.09226,,,,0.257,12073.0,16776.0,0.163,16776.0,tonnes,8676.0,,9601.0,https://doi.org/10.17895/ices.advice.21394104.v2,17653,2022,14,27.8.b,27.8.b,ank,,,,,,,,,,,,,,,,,,,,,,0.0,,0.53,,,,both,0.53,-0.999945,(ANF/07.),mon.27.78abd + ank.27.78abd,both,(ANF/07.) - 2020 - TAC - Original,Anglerfish,7,TAC,35299,Original,anf,both
5763,2020.0,Black-bellied anglerfish,ank,ank.27.78abd,195875,Lophius budegassa,27.7.a ~ 27.7.b ~ 27.7.c.1 ~ 27.7.c.2 ~ 27.7.d...,42068.0,Combined-sex SSB,tonnes,0.09226,,,,0.257,12073.0,16776.0,0.163,16776.0,tonnes,8676.0,,9601.0,https://doi.org/10.17895/ices.advice.21394104.v2,17653,2022,14,27.8.b,27.8.b,ank,,,,,,,,,,,,,,,,,,,,,,0.0,,0.53,,,,both,0.53,-0.999945,(ANF/8ABDE.),mon.27.78abd + ank.27.78abd,both,(ANF/8ABDE.) - 2020 - TAC - Original,Anglerfish,"8a, 8b, 8d and 8e",TAC,9458,Original,anf,both
5764,2020.0,Black-bellied anglerfish,ank,ank.27.78abd,195875,Lophius budegassa,27.7.a ~ 27.7.b ~ 27.7.c.1 ~ 27.7.c.2 ~ 27.7.d...,42068.0,Combined-sex SSB,tonnes,0.09226,,,,0.257,12073.0,16776.0,0.163,16776.0,tonnes,8676.0,,9601.0,https://doi.org/10.17895/ices.advice.21394104.v2,17653,2022,16,27.8.d.2,27.8.d.2,ank,,,,,,,,,,,,,,,,,,,,,,0.0,,0.00,,,,both,0.00,-1.000000,(ANF/07.),mon.27.78abd + ank.27.78abd,both,(ANF/07.) - 2020 - TAC - Original,Anglerfish,7,TAC,35299,Original,anf,both


In [172]:
print(
len(mStockCatch), 
len(tac), 
len(mStockCatchTAC[mStockCatchTAC._mergeDictTAC == 'both']),
len(mStockCatchTAC[mStockCatchTAC._mergeDictTAC == 'left_only']),
len(mStockCatchTAC[mStockCatchTAC._mergeDictTAC == 'right_only'])
)

2787 662 4489 1277 399
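
The number of `both` rows (4489) is larger than the number of rows in `mStockCatch` (2787) because one stock and year can match several TAC IDs, so the left rows are duplicated by the merge. A minimal sketch of this effect, using made-up frames `left` and `right` (not the notebook's data) and `value_counts()` on the indicator column as a shorter way to get the same three counts:

```python
import pandas as pd

# Hypothetical toy frames: stock 'a' in 2012 is covered by two TACs
left = pd.DataFrame({
    "FishStock": ["a", "b", "c"],
    "Year": [2012.0, 2012.0, 2016.0],
})
right = pd.DataFrame({
    "FishStock": ["a", "a", "d"],
    "Year": [2012.0, 2012.0, 2020.0],
    "TAC ID": ["T1", "T2", "T3"],
})

merged = left.merge(right, on=["FishStock", "Year"], how="outer",
                    indicator="_mergeDemo")

# Counts of 'both', 'left_only' and 'right_only' in one call
counts = merged["_mergeDemo"].value_counts()
print(counts)
```

Here the single left row for stock `a` appears twice in `merged`, once per matching TAC.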


### Merge StockCatchesTAC with SAD (Official)

In [174]:
# drop non-matching rows from previous merge
mStockCatchTAC = mStockCatchTAC[mStockCatchTAC._mergeCatch == 'both'].copy()

sadOff.year = sadOff.year.astype(np.float64)

# merge with SAD using StockCode and year
mStockCatchTACSADoff = mStockCatchTAC.merge(sadOff, left_on=['FishStock','Year'],
                                             right_on=['StockCode', 'year'],
                                              how='outer', indicator='_mergeSADoff')

In [175]:
print(
len(mStockCatch), 
len(sadOff), 
len(mStockCatchTACSADoff[mStockCatchTACSADoff._mergeSADoff == 'both']),
len(mStockCatchTACSADoff[mStockCatchTACSADoff._mergeSADoff == 'left_only']),
len(mStockCatchTACSADoff[mStockCatchTACSADoff._mergeSADoff == 'right_only']),
)

2787 834 1439 4373 730


In [179]:
exportMergeAll = mStockCatchTACSADoff[(mStockCatchTACSADoff._mergeSADoff == 'both') & (mStockCatchTACSADoff._mergeTAC == 'both')]
# forward slashes avoid the invalid escape sequences '\d' and '\S' and also work on Windows
exportMergeAll.to_csv('../dataTemp/StockCatchSADoffTAC.csv', index=False)
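
Output paths like the one in the export cell can also be built with `pathlib`, so that the same code works on Windows and POSIX systems without hand-written separators. A small sketch, assuming the same relative folder layout as the notebook:

```python
from pathlib import Path

# Relative output folder and file, joined without hard-coded separators
out_dir = Path("..") / "dataTemp"
out_path = out_dir / "StockCatchSADoffTAC.csv"

print(out_path.as_posix())
```

`DataFrame.to_csv` accepts a `Path` object directly, so `out_path` could be passed in place of the string.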

### Merge StockCatchesTAC with SAD (Carpenter)

In [None]:
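# the Carpenter SAD table is merged on 'ICES code' and 'Year' (see the keys below),
# so the column converted here is 'Year'
sad['Year'] = sad['Year'].astype(np.float64)

# merge with SAD (Carpenter) using ICES code and Year
mStockCatchTACsad = mStockCatchTAC.merge(sad, left_on=['FishStock','Year'],
                                             right_on=['ICES code', 'Year'],
                                              how='outer', indicator='_mergeSAD')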

In [None]:
# counts for the Carpenter merge (the previous version of this cell repeated the Official SAD counts)
print(
len(mStockCatch), 
len(sad), 
len(mStockCatchTACsad[mStockCatchTACsad._mergeSAD == 'both']),
len(mStockCatchTACsad[mStockCatchTACsad._mergeSAD == 'left_only']),
len(mStockCatchTACsad[mStockCatchTACsad._mergeSAD == 'right_only']),
)


In [None]:
exportMergeAllSAD = mStockCatchTACsad[(mStockCatchTACsad._mergeSAD == 'both') & (mStockCatchTACsad._mergeTAC == 'both')]
# placeholder file name, chosen so that the Official SAD export is not overwritten
exportMergeAllSAD.to_csv('../dataTemp/StockCatchSADTAC.csv', index=False)

## Indicators calculation

### SAD/TAC
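
A minimal sketch of how the ratio between SAD and TAC could be computed once both values sit in the same row. The frame `demo` and its numbers are made up for illustration, and the column name `SAD` is an assumption; only `TAC for comparison` is taken from the merged table above. The TAC column is converted with `pd.to_numeric` because it may arrive as text.

```python
import pandas as pd

# Hypothetical toy data: one row per TAC ID and year
demo = pd.DataFrame({
    "TAC ID": ["T1", "T2", "T3"],
    "Year": [2020.0, 2020.0, 2020.0],
    "TAC for comparison": ["1000", "500", "0"],  # may be stored as text
    "SAD": [800.0, 600.0, 100.0],                # assumed column name
})

# Convert TAC to numbers; unparseable values become NaN
tac_val = pd.to_numeric(demo["TAC for comparison"], errors="coerce")

# Ratio of SAD to TAC; zero or negative TACs give NaN instead of inf
demo["SAD_TAC"] = demo["SAD"] / tac_val.where(tac_val > 0)

print(demo[["TAC ID", "Year", "SAD_TAC"]])
```

In the merged table each TAC is repeated once per catch area, so the rows would probably need to be reduced to one per `TAC ID` and `Year` before the ratios are summarised.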

## Trash

In [None]:
sepAreas = stock["ICES Areas (splited with character '~')"].str.split('~', expand=True)
sepAreas = sepAreas.apply(lambda x: x.str.strip() if x.dtype == "object" else x)

In [None]:
# all non-missing area strings
areas = stock["ICES Areas (splited with character '~')"].dropna().to_list()

# longest area string (the first one wins on ties)
res = max(areas, key=len)

print("Longest String is : ", res) 

Longest String is :  27.1.a ~ 27.1.b ~ 27.10.a.1 ~ 27.10.a.2 ~ 27.10.b ~ 27.12.a.1 ~ 27.12.a.2 ~ 27.12.a.3 ~ 27.12.a.4 ~ 27.12.b ~ 27.12.c ~ 27.14.a ~ 27.14.b.1 ~ 27.14.b.2 ~ 27.2.a.1 ~ 27.2.a.2 ~ 27.2.b.1 ~ 27.2.b.2 ~ 27.3.a ~ 27.3.b.23 ~ 27.3.c.22 ~ 27.3.d.24 ~ 27.3.d.25 ~ 27.3.d.26 ~ 27.3.d.27 ~ 27.3.d.28.1 ~ 27.3.d.28.2 ~ 27.3.d.29 ~ 27.3.d.30 ~ 27.3.d.31 ~ 27.3.d.32 ~ 27.4.a ~ 27.4.b ~ 27.4.c ~ 27.5.a.1 ~ 27.5.a.2 ~ 27.5.b.1.a ~ 27.5.b.1.b ~ 27.5.b.2 ~ 27.6.a ~ 27.6.b.1 ~ 27.6.b.2 ~ 27.7.a ~ 27.7.b ~ 27.7.c.1 ~ 27.7.c.2 ~ 27.7.d ~ 27.7.e ~ 27.7.f ~ 27.7.g ~ 27.7.h ~ 27.7.j.1 ~ 27.7.j.2 ~ 27.7.k.1 ~ 27.7.k.2 ~ 27.8.a ~ 27.8.b ~ 27.8.c ~ 27.8.d.1 ~ 27.8.d.2 ~ 27.8.e.1 ~ 27.8.e.2 ~ 27.9.a ~ 27.9.b.1 ~ 27.9.b.2


In [None]:
pleStock = stock[stock.FishStock == 'san.sa.1r']
pleStock = pleStock["ICES Areas (splited with character '~')"].str.split('~', expand=True).iloc[0,:].to_list()
pleStock = [s.strip() for s in pleStock]

In [None]:
denmarkPle = catches[(catches['Area'].isin(pleStock))  & (catches['Species'] == 'san')].copy()
denmarkPle['2020'] = denmarkPle['2020'].astype(np.float64)
denmarkPle

KeyError: '2020'