# EU National Emission Ceilings (NEC) Directive Inventory

Data from 1990 to 2021

Source : https://www.eea.europa.eu/data-and-maps/data/ds_resolveuid/2BFB06C9-AB28-41EC-9808-576E32A36410

By : Robin Lotode

Setup : Download CSV file from Source link, add it in project/data/raw

Documentation : (from source file NEC_NFR19_2023_2023.xlsx)

National Emission Ceilings (NEC) Directive Inventory - NFR19 sector classification - feature catalogue

|Name |	Definition |
|---|---|
|Emissions  (float(8)) |	Emission value. |
|Country_code  (varchar(4)) |	International Country Code. Note: ISO 3166-1-Alpha-2 code elements |
|Country  (varchar(53)) |	Country name. |
|Pollutant_name  (varchar(20)) |	Short name of pollutant. Note: NH3, NMVOC, NOX, SO2 |
|Format_name  (varchar(100)) |	Name of guideline. Note: NFR19 sector classification |
|Sector_code  (varchar(15)) |	Sector code. Note: NFR19 sector classification |
|Parent_sector_code  (varchar(15)) |	Parent sector code. Note: NFR09 sector classification |
|Sector_name  (varchar(75)) |	Sector name. Note: NFR19 sector classification |
|Year  (varchar(4)) |	Annual data. Note: 1990-2021 |
|Unit  (varchar(40)) |	Emission unit. Note: Kilotonne (1000 tonnes) |
|Notation  (varchar(40)) |	Notation key. |


## Imports

In [74]:
import pandas as pd
from IPython.core.interactiveshell import InteractiveShell
import plotly
pd.options.plotting.backend = "plotly"
pd.set_option('expand_frame_repr', False)

InteractiveShell.ast_node_interactivity = "all"

## Opening the dataset

In [75]:
df = pd.read_csv("../data/raw/NEC_NFR19_2023_23.06.27.csv", delimiter="\t")
df = df.rename(str.lower, axis='columns')
df.dtypes
df.shape
df.head()

country_code           object
country                object
pollutant_name         object
format_name            object
sector_code            object
parent_sector_code     object
sector_name            object
year                    int64
emissions             float64
unit                   object
notations              object
dtype: object

(4430070, 11)

Unnamed: 0,country_code,country,pollutant_name,format_name,sector_code,parent_sector_code,sector_name,year,emissions,unit,notations
0,EE,Estonia,SO2,NEC NFR14 sector classification,1A4bii,NATIONAL TOTAL,Residential: Household and gardening (mobile),2002,0.00518,Gg (1000 tonnes),
1,MT,Malta,"Indeno (1,2,3-cd) Pyrene",NEC NFR14 sector classification,1A5a,NATIONAL TOTAL,Other stationary (including military),1997,,t,NO
2,RO,Romania,BC,NEC NFR14 sector classification,2D3b,NATIONAL TOTAL,Road paving with asphalt,2021,0.008868,Gg (1000 tonnes),
3,FR,France,PM2.5,NEC NFR14 sector classification,1A4bii,NATIONAL TOTAL,Residential: Household and gardening (mobile),1982,,Gg (1000 tonnes),NR
4,DE,Germany,BC,NEC NFR14 sector classification,1A4cii,NATIONAL TOTAL,Agriculture/Forestry/Fishing: Off-road vehicle...,2005,3.09385,Gg (1000 tonnes),


## Checking values

### Missing values

In [76]:
print("Missing values :")
df.isnull().sum()

df_dict = {}
for col in df.columns:
    if col != "emissions":
        print(f'{col} unique values : {len(list(df[col].unique()))}')
        df_dict[col] = df[col].value_counts()

Missing values :


country_code                0
country                     0
pollutant_name              0
format_name                 0
sector_code                 0
parent_sector_code     262565
sector_name                 0
year                        0
emissions             3169885
unit                        0
notations             2887851
dtype: int64

country_code unique values : 28
country unique values : 28
pollutant_name unique values : 31
format_name unique values : 1
sector_code unique values : 139
parent_sector_code unique values : 2
sector_name unique values : 136
year unique values : 42
unit unique values : 5
notations unique values : 12


The column "format_name" has one single unique value, so we can ignore it

In [None]:
df = df.drop("format_name")

In [77]:
for key, value in df_dict.items():
    if key in ["unit", "notations"]:
        print(key, ":", value, "\n")

unit : unit
t                   1729350
Gg (1000 tonnes)    1712520
TJNCV                617625
kg                   247050
g I-TEQ              123525
Name: count, dtype: int64 

notations : notations
NO    802444
NE    340031
NR    200782
IE    197660
??       563
C        451
N.        98
N/        64
N?        64
?.        61
Na         1
Name: count, dtype: int64 



Notation keys appear when the Emissions value is missing, explaining the reason why.

Notation keys : (from https://unfccc.int/files/national_reports/annex_i_ghg_inventories/reporting_requirements/application/pdf/crf_reporter_user_manual.pdf#page=74)
|Key | Meaning |
|---|---|
|NO | not occurring |
|NE | not estimated |
|NA | not applicable |
|IE | included elsewhere |
|C | Confidential |

Some notation key values aren't in the expected values [NO, NE, NR, IE, C] :

In [78]:
df[(df["notations"].notnull()) & (~df["notations"].isin(["NO", "NE", "NR", "IE", "C"]))]

Unnamed: 0,country_code,country,pollutant_name,format_name,sector_code,parent_sector_code,sector_name,year,emissions,unit,notations
7352,GR,Greece,PM2.5,NEC NFR14 sector classification,1B2aiv,NATIONAL TOTAL,Fugitive emissions oil: Refining / storage,2017,,Gg (1000 tonnes),??
15048,GR,Greece,NH3,NEC NFR14 sector classification,1A4ai,NATIONAL TOTAL,Commercial/institutional: Stationary,2005,,Gg (1000 tonnes),??
22490,GR,Greece,NOx,NEC NFR14 sector classification,1B2aiv,NATIONAL TOTAL,Fugitive emissions oil: Refining / storage,2012,,Gg (1000 tonnes),??
27226,LT,Lithuania,NOx,NEC NFR14 sector classification,3Dc,NATIONAL TOTAL,Farm-level agricultural operations including s...,2001,,Gg (1000 tonnes),??
32754,GR,Greece,Se,NEC NFR14 sector classification,1A3ai(ii),,International aviation cruise (civil),2012,,t,N.
...,...,...,...,...,...,...,...,...,...,...,...
4403326,GR,Greece,NOx,NEC NFR14 sector classification,1B2aiv,NATIONAL TOTAL,Fugitive emissions oil: Refining / storage,2017,,Gg (1000 tonnes),??
4406197,GR,Greece,SO2,NEC NFR14 sector classification,1B2aiv,NATIONAL TOTAL,Fugitive emissions oil: Refining / storage,2013,,Gg (1000 tonnes),??
4414345,GR,Greece,Cu,NEC NFR14 sector classification,1A3aii(ii),,Domestic aviation cruise (civil),2012,,t,N.
4418311,GR,Greece,TSP,NEC NFR14 sector classification,1B2aiv,NATIONAL TOTAL,Fugitive emissions oil: Refining / storage,2021,,Gg (1000 tonnes),??


In [79]:
print("pollutants", *df_dict["pollutant_name"].index, sep=" | ")
print("sectors", *df_dict["sector_name"].index, sep=" | ")

pollutants | SO2 | NH3 | PM2.5 | NMVOC | NOx | Biomass | Other Fuels | PM10 | Pb | Cd | Cu | HCB | PCBs | Hg | Se | Benzo(b) Fluoranthene | Benzo(a) Pyrene | Solid Fuels | As | Ni | Indeno (1,2,3-cd) Pyrene | CO | Liquid Fuels | PCDD/PCDF (dioxins/furans) | Zn | Cr | Gaseous Fuels | TSP | Total PAHs | BC | benzo(k) Fluoranthene
sectors | NATIONAL TOTAL FOR COMPLIANCE | Residential: Household and gardening (mobile) | Agriculture/Forestry/Fishing: Stationary | Carbide production | Fugitive emissions oil: Refining / storage | Open burning of waste | Titanium dioxide production | Manure management - Dairy cattle  | Off-farm storage, handling and transport of bulk agricultural products | Industrial wastewater handling | Cement production | Manure management - Sheep | Other mineral products | Food and beverages industry  | Railways | Nickel production | National navigation (shipping) | Wood processing | Other product use | Domestic solvent use including fungicides | Degreasing | Quarrying an

In [98]:
pollutants_df = pd.DataFrame(columns=["unit"])
pollutants_df.index.name = "pollutant_name"
for poll in df_dict["pollutant_name"].index:
    val_count = df[df["pollutant_name"] == poll]["unit"].value_counts()
    if len(val_count) != 1:
        print(poll, val_count)
    pollutants_df.loc[poll] = val_count.index[0]
pollutants_df


Unnamed: 0_level_0,unit
pollutant_name,Unnamed: 1_level_1
SO2,Gg (1000 tonnes)
NH3,Gg (1000 tonnes)
PM2.5,Gg (1000 tonnes)
NMVOC,Gg (1000 tonnes)
NOx,Gg (1000 tonnes)
Biomass,TJNCV
Other Fuels,TJNCV
PM10,Gg (1000 tonnes)
Pb,t
Cd,t


## Plotting

In [81]:
pollutant = "PM2.5"
country = "France"
data = df[(df["pollutant_name"] == pollutant) & (df["country"] == country)]

top_avg = data.groupby("sector_name")["emissions"].mean().sort_values()
top_avg

sector_name
Lead production                                      0.000005
Nickel production                                    0.001451
Manure management - Mules and asses                  0.001567
Other industrial processes                           0.002228
Manure management - Other animals                    0.004793
                                                       ...   
Storage, handling and transport of metal products         NaN
Urine and dung deposited by grazing animals               NaN
Use of pesticides                                         NaN
Volcanoes                                                 NaN
Zinc production                                           NaN
Name: emissions, Length: 136, dtype: float64

In [82]:
data.sort_values("year")
data.plot(kind="bar", x="Year", y=["emissions"], color="sector_name", title=f"{pollutant} emissions per sector in {country}")

Unnamed: 0,country_code,country,pollutant_name,format_name,sector_code,parent_sector_code,sector_name,year,emissions,unit,notations
2398614,FR,France,PM2.5,NEC NFR14 sector classification,3Da1,NATIONAL TOTAL,Inorganic N-fertilizers (includes also urea ap...,1980,,Gg (1000 tonnes),NR
3069188,FR,France,PM2.5,NEC NFR14 sector classification,1A3biii,NATIONAL TOTAL,NATIONAL TOTAL FOR COMPLIANCE,1980,,Gg (1000 tonnes),NR
1576300,FR,France,PM2.5,NEC NFR14 sector classification,5C1bv,NATIONAL TOTAL,Cremation,1980,,Gg (1000 tonnes),NR
3075490,FR,France,PM2.5,NEC NFR14 sector classification,2A3,NATIONAL TOTAL,Glass production,1980,,Gg (1000 tonnes),NR
3076566,FR,France,PM2.5,NEC NFR14 sector classification,1B2aiv,NATIONAL TOTAL,NATIONAL TOTAL FOR COMPLIANCE,1980,,Gg (1000 tonnes),NR
...,...,...,...,...,...,...,...,...,...,...,...
196960,FR,France,PM2.5,NEC NFR14 sector classification,2L,NATIONAL TOTAL,"Other production, consumption, storage, transp...",2021,,Gg (1000 tonnes),
1834867,FR,France,PM2.5,NEC NFR14 sector classification,3Dc,NATIONAL TOTAL,Farm-level agricultural operations including s...,2021,0.842860,Gg (1000 tonnes),
1820979,FR,France,PM2.5,NEC NFR14 sector classification,1A3c,NATIONAL TOTAL,Railways,2021,0.547254,Gg (1000 tonnes),
112123,FR,France,PM2.5,NEC NFR14 sector classification,3Da1,NATIONAL TOTAL,Inorganic N-fertilizers (includes also urea ap...,2021,,Gg (1000 tonnes),NE


ValueError: Value of 'x' is not the name of a column in 'data_frame'. Expected one of ['country_code', 'country', 'pollutant_name', 'format_name', 'sector_code', 'parent_sector_code', 'sector_name', 'year', 'emissions', 'unit', 'notations'] but received: Year

Unnamed: 0,Country_Code,Country,Pollutant_name,Format_name,sector_code,parent_sector_code,sector_name,Year,Emissions,Unit,Notations
1,MT,Malta,"Indeno (1,2,3-cd) Pyrene",NEC NFR14 sector classification,1A5a,NATIONAL TOTAL,Other stationary (including military),1997,,t,NO
3,FR,France,PM2.5,NEC NFR14 sector classification,1A4bii,NATIONAL TOTAL,Residential: Household and gardening (mobile),1982,,Gg (1000 tonnes),NR
6,NL,Netherlands,Total PAHs,NEC NFR14 sector classification,6A,NATIONAL TOTAL,Other (included in national total for entire t...,2011,,t,NE
12,LT,Lithuania,NOx,NEC NFR14 sector classification,1A5a,NATIONAL TOTAL,NATIONAL TOTAL FOR COMPLIANCE,2003,,Gg (1000 tonnes),NE
17,DK,Denmark,PM2.5,NEC NFR14 sector classification,1A5c,,Multilateral operations,1995,,Gg (1000 tonnes),NE
...,...,...,...,...,...,...,...,...,...,...,...
4430049,SE,Sweden,NH3,NEC NFR14 sector classification,11C,,Other natural emissions,2008,,Gg (1000 tonnes),NO
4430052,FR,France,SO2,NEC NFR14 sector classification,2C4,NATIONAL TOTAL,Magnesium production,1999,,Gg (1000 tonnes),IE
4430058,HR,Croatia,SO2,NEC NFR14 sector classification,1A5b,NATIONAL TOTAL,NATIONAL TOTAL FOR COMPLIANCE,2000,,Gg (1000 tonnes),IE
4430059,MT,Malta,Benzo(b) Fluoranthene,NEC NFR14 sector classification,1A2a,NATIONAL TOTAL,Stationary combustion in manufacturing industr...,2011,,t,NO
