# Statistics Explained articles and matches with OECD's Glossary

## Objective: to build a common vocabulary and construct "profiles" of the terms of this vocabulary

### Installation instructions
*    Download the notebook as a "raw" file and save it with the extension .ipynb (remove the .txt extension that is added).
*    Install the necessary libraries from your Jupyter command prompt. These, together with the versions used, are:
    *    pyodbc==4.0.32
    *    spacy==3.2.1 
    *    pandas==1.3.5
    *    nltk==3.6.5
    *    beautifulsoup4 and lxml (imported by the glossary scraping cell; versions not recorded)
*   Launch the notebook and enter your own credentials for the Virtuoso database in the call to pyodbc.connect() in the cell titled "Connect to the database".


### On the first run, download the spaCy model; comment out the download in subsequent runs

* Note that we are using spaCy's "large" English model (en_core_web_lg).

In [1]:
import re
import pandas as pd
import spacy
import sys
from collections import Counter

## Run once to download the language model, then comment out
!{sys.executable} -m spacy download en_core_web_lg

nlp = spacy.load('en_core_web_lg')
nlp.max_length = 1500000
print('Finished loading.')


Collecting en-core-web-lg==3.2.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.2.0/en_core_web_lg-3.2.0-py3-none-any.whl (777.4 MB)
[+] Download and installation successful
You can now load the package via spacy.load('en_core_web_lg')
Finished loading.


In [2]:
from datetime import datetime

def file_name(pre,ext):
    ## Build a time-stamped file name: <pre>_<month>_<day>_<hour>_<minute>.<ext>
    current_time = datetime.now() 
    return pre + '_'+ str(current_time.month)+ '_' + str(current_time.day) + \
                 '_' + str(current_time.hour)+ '_' + str(current_time.minute)  +'.'+ext

### Connect to the database

In [3]:
import pyodbc
c = pyodbc.connect('DSN=Virtuoso All;DBA=ESTAT;UID=xxxxx;PWD=xxxxx')
cursor = c.cursor()

In [4]:
import re
#import unicodedata as ud

def clean(x, quotes=True):
    if pd.isnull(x): return x  
    x = x.strip()
    
    ## make letter-question mark-letter -> letter-quote-space-letter !!! but NOT in the lists of URLs!!!
    if quotes:
        x = re.sub(r'([A-Za-z])\?([A-Za-z])','\\1\' \\2',x) 
    
    ## make letter-question mark-space-lower case letter -> letter-quote-space-letter
    x = re.sub(r'([A-Za-z])\? ([a-z])','\\1\' \\2',x) 

    ## delete ,000 commas in numbers    
    x = re.sub(r'\b(\d+),(\d+)\b','\\1\\2',x) ## CORRECTED
    
    ## delete  000 spaces in numbers
    x = re.sub(r'\b(\d+) (\d+)\b','\\1\\2',x) ## CORRECTED
    
    ## collapse runs of spaces into a single space
    x = re.sub(r' +', ' ',x)
    
    ## remove start and end spaces
    x = re.sub(r'^ +| +$', '',x,flags=re.MULTILINE) 
    
    ## space-comma -> comma
    x = re.sub(r' \,',',',x)
    
    ## space-dot -> dot
    x = re.sub(r' \.','.',x)
    
    ## mis-decoded right single quotes show up as 'â' plus two more characters; turn them back into a plain quote
    x = re.sub(r'â.{2}',"'",x)
    
    #x = x.encode('latin1').decode('utf-8') ## â\x80\x99
    #x = ud.normalize('NFKD',x).encode('ascii', 'ignore').decode()
    
    return x
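
A quick way to see what `clean` does is to run a condensed copy of its rules on a made-up string. The sketch below is illustrative only: `clean_demo` and the sample sentence are not part of the notebook, and the pandas null check and the `â..` repair are left out.

```python
import re

def clean_demo(x, quotes=True):
    """Condensed copy of the clean() rules above (hypothetical helper, for illustration)."""
    x = x.strip()
    if quotes:
        # letter ? letter -> letter ' space letter
        x = re.sub(r"([A-Za-z])\?([A-Za-z])", r"\1' \2", x)
    # letter ? space lower-case letter -> letter ' space letter
    x = re.sub(r"([A-Za-z])\? ([a-z])", r"\1' \2", x)
    # drop commas and spaces used as thousands separators
    x = re.sub(r"\b(\d+),(\d+)\b", r"\1\2", x)
    x = re.sub(r"\b(\d+) (\d+)\b", r"\1\2", x)
    # collapse repeated spaces, trim line starts and ends
    x = re.sub(r" +", " ", x)
    x = re.sub(r"^ +| +$", "", x, flags=re.MULTILINE)
    # no space before commas and full stops
    x = re.sub(r" ,", ",", x)
    x = re.sub(r" \.", ".", x)
    return x

sample = "  The EU?s output rose to 3,100 units , up from 2 900 .  "
cleaned = clean_demo(sample)
print(cleaned)
```

Two side effects are worth keeping in mind: the apostrophe rule leaves a space after the quote (`EU' s`), and a single pass of the comma rule only merges the first two groups of a number such as 1,234,567.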

### Statistics explained articles

* Select IDs and titles from dat_link_info where resource_information_id=1 (i.e. Eurostat; see ESTAT.V1.mod_resource_information), keeping only IDs that also appear in dat_article.
* Carry out data cleansing on titles.


In [5]:
SQLCommand = """SELECT id, title 
                FROM ESTAT.V1.dat_link_info 
                WHERE resource_information_id=1 AND id IN (SELECT id FROM ESTAT.V1.dat_article) """

SE_df = pd.read_sql(SQLCommand,c)

SE_df['title'] = SE_df['title'].apply(clean)
SE_df.head(5)


Unnamed: 0,id,title
0,7,Accidents at work statistics
1,13,National accounts and GDP
2,16,Railway safety statistics in the EU
3,17,Railway freight transport statistics
4,18,Railway passenger transport statistics - quart...


### Add paragraph titles and contents

* From dat_article_paragraph with abstract=0 (i.e. "no").
* Match article_id from dat_article_paragraph with id from dat_article.
* Carry out data cleansing on titles and paragraph contents.

In [6]:
SQLCommand = """SELECT article_id, title, content 
                FROM ESTAT.V1.dat_article_paragraph
                WHERE abstract=0 AND article_id IN (SELECT id FROM ESTAT.V1.dat_article) """

add_content = pd.read_sql(SQLCommand,c)
add_content['title'] = add_content['title'].apply(clean)
add_content['content'] = add_content['content'].apply(clean)
add_content

Unnamed: 0,article_id,title,content
0,2905,Absences from work sharply increase in first h...,Absences from work recorded unprecedented high...
1,2905,Absences: 9.5 % of employment in Q4 2019 and 1...,The article's next figure (Figure 4) compares ...
2,2905,Higher share of absences from work among women...,"Considering all four quarters of 2020, the sha..."
3,2905,Absences from work due to own illness or disab...,"From Q4 2019 to Q4 2020, the number of people ..."
4,2905,Absences from work due to holidays,"Expressed as a share of employed people, absen..."
...,...,...,...
3854,10539,General presentation and definition,Scope of asylum statistics and Dublin statisti...
3855,10539,Methodological aspects in asylum statistics,Annual aggregate of the number of asylum appli...
3856,10539,Methodological aspects in Dublin statistics,Asymmetries For most of the collected Dublin s...
3857,10539,What questions can or cannot be answered with ...,How many asylum seekers are entering EU Member...


### Aggregate the paragraph titles and contents of SE articles by article id

* Create a column _raw content_ which gathers all paragraph titles and contents in one text per article.

In [7]:
add_content_grouped = add_content.groupby(['article_id'])[['title','content']].aggregate(lambda x: list(x))
add_content_grouped.reset_index(drop=False, inplace=True)
for i in range(len(add_content_grouped)):
    add_content_grouped.loc[i,'raw content'] = ''
    for (a,b) in zip(add_content_grouped.loc[i,'title'],add_content_grouped.loc[i,'content']):
        add_content_grouped.loc[i,'raw content'] += ' '+a + ' ' + b
add_content_grouped = add_content_grouped[['article_id','raw content']]    

add_content_grouped

Unnamed: 0,article_id,raw content
0,7,"Number of accidents In 2018, there were 3.1 m..."
1,13,Developments for GDP in the EU-27: growth sin...
2,16,Fall in the number of railway accidents 9 % f...
3,17,Downturn for EU transport performance in 2019...
4,18,Rail passenger transport performance continue...
...,...,...
860,10456,Problem After successfully identifying and jo...
861,10470,"Problem In France, there was significant room..."
862,10506,General overview Nine PEEIs concern short-ter...
863,10531,What are administrative sources? The term 'ad...
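
The row-by-row loop above can also be written as a single groupby. This is only a sketch on a made-up three-row frame (`toy`, `piece` and the placeholder strings are assumptions, not notebook data); it should give the same concatenation, including the leading space before each paragraph title.

```python
import pandas as pd

# Hypothetical stand-in for add_content: two paragraphs for article 7, one for article 13
toy = pd.DataFrame({
    "article_id": [7, 7, 13],
    "title": ["T1", "T2", "T3"],
    "content": ["c1", "c2", "c3"],
})

# Build ' title content' per paragraph, then concatenate the pieces per article
toy["piece"] = " " + toy["title"] + " " + toy["content"]
raw = (toy.groupby("article_id")["piece"]
          .agg("".join)
          .rename("raw content")
          .reset_index())
print(raw)
```

On a few thousand paragraphs the difference in speed is unlikely to matter; the main gain is that the intent (one text per article) is stated in one expression.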


### Merge raw content of SE articles with main file

* Also, prepend the article title to the raw content.

In [8]:
SE_df = pd.merge(SE_df,add_content_grouped,left_on='id',right_on='article_id',how='inner')
SE_df.drop(['article_id'],axis=1,inplace=True)

SE_df['raw content'] = SE_df['title'] +'. '+SE_df['raw content']

SE_df.head(5)

Unnamed: 0,id,title,raw content
0,7,Accidents at work statistics,Accidents at work statistics. Number of accid...
1,13,National accounts and GDP,National accounts and GDP. Developments for G...
2,16,Railway safety statistics in the EU,Railway safety statistics in the EU. Fall in ...
3,17,Railway freight transport statistics,Railway freight transport statistics. Downtur...
4,18,Railway passenger transport statistics - quart...,Railway passenger transport statistics - quart...
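
Because the merge uses `how='inner'`, articles that have no rows in the paragraph table drop out at this step. A minimal sketch on made-up frames (`titles`, `grouped` and their values are placeholders, not notebook data):

```python
import pandas as pd

# Hypothetical inputs: article 99 has a title but no aggregated paragraphs
titles = pd.DataFrame({"id": [7, 13, 99], "title": ["Title A", "Title B", "Title C"]})
grouped = pd.DataFrame({"article_id": [7, 13], "raw content": ["Para A", "Para B"]})

# Inner join keeps only ids present on both sides
merged = pd.merge(titles, grouped, left_on="id", right_on="article_id", how="inner")
merged = merged.drop(columns=["article_id"])

# Prepend the title, separated by a full stop, as in the cell above
merged["raw content"] = merged["title"] + ". " + merged["raw content"]
print(merged)
```

If articles without paragraphs should be kept, `how='left'` would retain them with a missing `raw content`, which then needs handling before lemmatization.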


### Lemmatize 'raw content'

* NLTK's WordNet lemmatizer seemed to give better results than spaCy's here. The text is converted to lower case before lemmatizing and back to upper case afterwards.

In [9]:
import nltk

## WordNetLemmatizer needs the WordNet corpus; if it is missing, run nltk.download('wordnet') once
w_tokenizer = nltk.tokenize.WhitespaceTokenizer()
lemmatizer = nltk.stem.WordNetLemmatizer()

def lemmatize_text(text):
    return [lemmatizer.lemmatize(w) for w in w_tokenizer.tokenize(text)]

SE_df['raw content'] = SE_df['raw content'].apply(lambda x: x.lower())
SE_df['raw content']= SE_df['raw content'].apply(lemmatize_text)
SE_df['raw content']= [' '.join(map(str, l)) for l in SE_df['raw content']]
SE_df['raw content'] = SE_df['raw content'].apply(lambda x: x.upper())
SE_df


Unnamed: 0,id,title,raw content
0,7,Accidents at work statistics,ACCIDENT AT WORK STATISTICS. NUMBER OF ACCIDEN...
1,13,National accounts and GDP,NATIONAL ACCOUNT AND GDP. DEVELOPMENT FOR GDP ...
2,16,Railway safety statistics in the EU,RAILWAY SAFETY STATISTIC IN THE EU. FALL IN TH...
3,17,Railway freight transport statistics,RAILWAY FREIGHT TRANSPORT STATISTICS. DOWNTURN...
4,18,Railway passenger transport statistics - quart...,RAILWAY PASSENGER TRANSPORT STATISTIC - QUARTE...
...,...,...,...
860,10456,"Merging statistics and geospatial information,...","MERGING STATISTIC AND GEOSPATIAL INFORMATION, ..."
861,10470,"Merging statistics and geospatial information,...","MERGING STATISTIC AND GEOSPATIAL INFORMATION, ..."
862,10506,Methods for compiling PEEIs in short-term busi...,METHOD FOR COMPILING PEEIS IN SHORT-TERM BUSIN...
863,10531,Building the System of National Accounts - adm...,BUILDING THE SYSTEM OF NATIONAL ACCOUNT - ADMI...
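
One detail in the output above: the first row keeps "STATISTICS." in the plural, while the third row shows "STATISTIC". A plausible explanation is that whitespace tokenization leaves the full stop attached to the token, so the lemmatizer is asked about "statistics." rather than "statistics". The sketch below mimics this with a toy lookup instead of WordNet (`TOY_LEMMAS` and `toy_lemmatize` are hypothetical and only illustrate the tokenization effect).

```python
# Toy lookup standing in for WordNetLemmatizer.lemmatize (illustration only)
TOY_LEMMAS = {"accidents": "accident", "statistics": "statistic"}

def toy_lemmatize(text):
    # str.split() splits on whitespace, so punctuation stays glued to the word
    tokens = text.lower().split()
    return " ".join(TOY_LEMMAS.get(tok, tok) for tok in tokens).upper()

result = toy_lemmatize("Accidents at work statistics. Number of accidents")
print(result)
```

If such misses matter for matching, separating punctuation from words before lemmatizing (or using a punctuation-aware tokenizer) would be one option to try.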


### OECD - Glossary of Statistical Terms
https://stats.oecd.org/glossary/alpha.asp

* Scrape terms and lemmatize.

In [10]:
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

url = "https://stats.oecd.org/glossary/alpha.asp"
html = urlopen(url)
soup = BeautifulSoup(html, 'lxml')
text = soup.get_text()
    
rows = soup.find_all('tr')
str_cells = str(rows)
cleantext = BeautifulSoup(str_cells, "lxml").get_text()
#print(cleantext)

list_rows = []
for row in rows:
    cells = row.find_all('a')
    str_cells = str(cells)
    ## use a name other than 'clean' so the clean() function defined earlier is not overwritten
    tag_re = re.compile('<.*?>')
    clean2 = (re.sub(tag_re, '',str_cells))
    list_rows.append(clean2)
#print(clean2)
#type(clean2)

df = pd.DataFrame(list_rows)
df.head(10)
df[0]=df[0].apply(lambda x: re.sub(r'\[',' ',x))
df[0]=df[0].apply(lambda x: re.sub(r'\]',' ',x))
df1 = df[0].str.split(',', expand=True)
df_t = df1.T
## after transposing, column 22 corresponds to the 23rd table row, which holds the term links
df_t=df_t[[22]]
df_t = df_t.rename(columns={22: 'term'})
nan_value = float("NaN")

df_t.replace(" ", nan_value, inplace=True)

df_t.dropna(subset = ["term"], inplace=True)
df_t.insert(0, 'id', range(len(df_t)))
df_t.reset_index(inplace=True)
df_t.drop(columns=['index'],inplace=True)
df_t.head()

df_t['lemmatized_term']= df_t['term'].apply(lambda x: x.lower())
df_t['lemmatized_term']= df_t['lemmatized_term'].apply(lemmatize_text)
df_t['lemmatized_term']= [' '.join(map(str, l)) for l in df_t['lemmatized_term']]
df_t['lemmatized_term']= df_t['lemmatized_term'].apply(lambda x: x.upper())
df_t

Unnamed: 0,id,term,lemmatized_term
0,0,A posteriori audit,A POSTERIORI AUDIT
1,1,A priori audit,A PRIORI AUDIT
2,2,A programme language (APL),A PROGRAMME LANGUAGE (APL)
3,3,Abatement,ABATEMENT
4,4,Abatement cost,ABATEMENT COST
...,...,...,...
7074,7074,Zero-coupon / deep discount bond,ZERO-COUPON / DEEP DISCOUNT BOND
7075,7075,Zero-coupon bonds,ZERO-COUPON BOND
7076,7076,Zones,ZONE
7077,7077,Zoning,ZONING
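
The string handling in the scraping cell can be followed on a single made-up row. The sketch assumes that `str(row.find_all('a'))` looks like a bracketed, comma-separated list of anchor tags; the `href` values and the two terms are placeholders, and the final `strip()` is an addition that the cell above does not apply to the 'term' column.

```python
import re

# Hypothetical stand-in for str(row.find_all('a')) on one table row
str_cells = '[<a href="#">Abatement</a>, <a href="#">Abatement cost</a>]'

tag_re = re.compile(r"<.*?>")
no_tags = re.sub(tag_re, "", str_cells)        # drop every <...> tag
no_brackets = re.sub(r"\[|\]", " ", no_tags)   # brackets of the list repr -> spaces
terms = [t.strip() for t in no_brackets.split(",")]
print(terms)
```

Since the split is on commas, any glossary term that itself contains a comma would end up as two separate entries; reading the text of each anchor directly would avoid that.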


### Prepare spaCy's PhraseMatcher by building a custom vocabulary from OECD's Glossary (from column 'lemmatized_term')

In [11]:
from spacy.matcher import PhraseMatcher

matcher = PhraseMatcher(nlp.vocab)
terms = df_t['lemmatized_term'].values.tolist()
# Only run nlp.make_doc to speed things up
patterns = [nlp.make_doc(text) for text in terms]
matcher.add("TerminologyList", patterns)

### Apply PhraseMatcher

* Collect results per SE article ('doc_id') in a dataframe 'res'. Ignore matches of two words or fewer.
* Depending on length of match: columns '3-Phrases', '4-Phrases', '5-and-above-Phrases'. These will contain dictionaries with the matched lemmatized terms and their counts, in descending order of counts.
* Column 'Terms' has a dictionary with the corresponding **original terms** in OECD's Glossary and their counts in the matches.
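
The cell below collects the matched phrases in plain lists. One way to get from such lists to count dictionaries in descending order is `collections.Counter`, which is already imported in the first cell. A minimal sketch with made-up spans for a single article (the `spans` list, `buckets` and `profile` are assumptions for illustration):

```python
from collections import Counter

# Hypothetical matched spans for one article (lemmatized, upper-cased)
spans = ["RATE OF CHANGE", "ANNUAL GROWTH RATE", "RATE OF CHANGE",
         "GROSS DOMESTIC PRODUCT (GDP)", "VALUE ADDED AT FACTOR COST", "A PRIORI"]

buckets = {"3-Phrases": Counter(), "4-Phrases": Counter(), "5-and-above-Phrases": Counter()}
for text in spans:
    n_words = len(text.split(" "))   # same word count as in the notebook cell
    if n_words < 3:
        continue                     # ignore matches of two words or fewer
    if n_words == 3:
        key = "3-Phrases"
    elif n_words == 4:
        key = "4-Phrases"
    else:
        key = "5-and-above-Phrases"
    buckets[key][text] += 1

# most_common() orders by count, descending; dict() preserves that order
profile = {k: dict(c.most_common()) for k, c in buckets.items()}
print(profile)
```

Note that the word count is based on splitting at spaces, so "(GDP)" counts as one word even though the tokenizer may split it into several tokens.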

In [12]:
res = pd.DataFrame(index=range(len(SE_df)))
res['3-Phrases']=[[] for i in range(len(SE_df))]
res['4-Phrases']=[[] for i in range(len(SE_df))]
res['5-and-above-Phrases']=[[] for i in range(len(SE_df))]
res['Terms']=[dict() for i in range(len(SE_df))]
docs=nlp.pipe(SE_df['raw content'])
for (i,doc) in enumerate(docs):
    print(i)
    for sent in doc.sents:
        matches = matcher(sent)
        for match_id, start, end in matches:
            span = doc[start:end]
            n_words = len(span.text.split(' '))
            if n_words >= 3:
                doc_id = SE_df.loc[i,'id']
                idx = df_t.index[df_t['lemmatized_term'].str.contains(span.text,regex=False)].tolist()
                print(i,SE_df.loc[i,'title'],len(sent.text),'>',n_words,span.text,idx)
                res.loc[i,'doc_id']=doc_id
                for elem in df_t.loc[idx,'term'].values.tolist():
                    if elem in res.loc[i,'Terms'].keys():
                        res.loc[i,'Terms'][elem] +=1
                    else:
                        res.loc[i,'Terms'][elem] =1
                #res.loc[i,'Terms'].append(concepts_df.loc[idx,'term'].values.tolist())
                if n_words == 3:
                    res.loc[i,'3-Phrases'].append(span.text)
                elif n_words == 4:
                    res.loc[i,'4-Phrases'].append(span.text)
                else:
                    res.loc[i,'5-and-above-Phrases'].append(span.text)
                    
                    


0
0 Accidents at work statistics 324 > 4 HEALTH AND SOCIAL WORK [1988, 3114]
1
1 National accounts and GDP 190 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 142 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 254 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 206 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 143 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 231 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 206 > 3 ANNUAL GROWTH RATE [242, 248]
1 National accounts and GDP 206 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 314 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 173 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 162 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
1 National accounts and GDP 162 > 3 PURCHASING POWER STANDARD [5219]
1 National accounts and GDP 175 > 3 

33
33 International trade in goods 148 > 3 ANNUAL GROWTH RATE [242, 248]
33 International trade in goods 169 > 3 ANNUAL GROWTH RATE [242, 248]
34
34 Material flow accounts and resource productivity 128 > 4 GROSS DOMESTIC PRODUCT (GDP) [2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 4298, 4957]
34 Material flow accounts and resource productivity 128 > 4 DOMESTIC MATERIAL CONSUMPTION (DMC) [1713]
34 Material flow accounts and resource productivity 219 > 3 GROSS CAPITAL FORMATION [2806]
35
35 Digital economy and society statistics - enterprises 212 > 4 ELECTRONIC DATA INTERCHANGE (EDI) [1932]
36
37
37 Healthy life years statistics 133 > 3 QUALITY OF LIFE [5263]
38
39
40
40 Children at risk of poverty or social exclusion 785 > 3 TYPE OF HOUSEHOLD [6662]
40 Children at risk of poverty or social exclusion 785 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
40 Children at risk of poverty or social exclusion 785 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
40 Children at risk o

57 Structure of government debt 233 > 3 GOVERNMENT GROSS DEBT [2646, 2651, 2771]
57 Structure of government debt 304 > 3 GOVERNMENT GROSS DEBT [2646, 2651, 2771]
57 Structure of government debt 178 > 3 GOVERNMENT GROSS DEBT [2646, 2651, 2771]
57 Structure of government debt 245 > 3 GOVERNMENT GROSS DEBT [2646, 2651, 2771]
57 Structure of government debt 223 > 3 GOVERNMENT GROSS DEBT [2646, 2651, 2771]
57 Structure of government debt 175 > 3 GOVERNMENT GROSS DEBT [2646, 2651, 2771]
57 Structure of government debt 177 > 3 GOVERNMENT GROSS DEBT [2646, 2651, 2771]
57 Structure of government debt 179 > 3 GOVERNMENT GROSS DEBT [2646, 2651, 2771]
58
59
59 Healthcare resource statistics - beds 189 > 3 PSYCHIATRIC CARE BED [5185]
59 Healthcare resource statistics - beds 116 > 3 PSYCHIATRIC CARE BED [5185]
59 Healthcare resource statistics - beds 428 > 3 PSYCHIATRIC CARE BED [5185]
59 Healthcare resource statistics - beds 267 > 3 PSYCHIATRIC CARE BED [5185]
59 Healthcare resource statistics - be

98 Educational attainment statistics 149 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
98 Educational attainment statistics 232 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
99
99 Statistics on young people neither in employment nor in education or training 167 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
99 Statistics on young people neither in employment nor in education or training 238 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
99 Statistics on young people neither in employment nor in education or training 279 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
99 Statistics on young people neither in employment nor in education or training 279 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
99 Statistics on young people neither in employment nor in education or training 279 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
99 Statistics on young people neither in employment nor in education or training 255 > 3 LEVEL OF EDUCATION [362

103 Tax revenue statistics 148 > 3 ACTUAL SOCIAL CONTRIBUTION [76, 1969]
104
104 Tourism statistics 396 > 3 BALANCE OF PAYMENT [382, 384, 385, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 959, 1317, 6403, 6407]
105
105 Chapter 5 259 > 3 PLACE OF RESIDENCE [4896]
105 Chapter 5 261 > 3 PLACE OF RESIDENCE [4896]
106
106 Wages and labour costs 126 > 3 PURCHASING POWER STANDARD [5219]
107
107 Overweight and obesity - BMI statistics 193 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
107 Overweight and obesity - BMI statistics 305 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
108
109
110
110 Tourism industries - economic analysis 196 > 5 VALUE ADDED AT FACTOR COST [2841, 6828]
110 Tourism industries - economic analysis 187 > 5 VALUE ADDED AT FACTOR COST [2841, 6828]
110 Tourism industries - economic analysis 194 > 5 VALUE ADDED AT FACTOR COST [2841, 6828]
111
111 Healthcare expenditure statistics 187 > 4 GROSS DOMESTIC PRODUCT (GDP) [2812, 2813, 2814, 2815, 2816, 2817

134 Construction production (volume) index overview 86 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
135
135 Industrial production (volume) index overview 184 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
135 Industrial production (volume) index overview 227 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
136
136 Industrial producer price index overview 41 > 3 PRODUCER PRICE INDEX [3240, 4633, 5104, 6830]
136 Industrial producer price index overview 157 > 3 PRODUCER PRICE INDEX [3240, 4633, 5104, 6830]
136 Industrial producer price index overview 157 > 3 PRODUCER PRICE INDEX [3240, 4633, 5104, 6830]
136 Industrial producer price index overview 72 > 3 PRODUCER PRICE INDEX [3240, 4633, 5104, 6830]
136 Industrial producer price index overview 105 > 3 CONSUMER PRICE INDEX [1124, 1125, 2873]
136 Industrial producer price index overview 77 > 3 SUBSIDY ON PRODUCT [4611, 4612, 6216, 6218, 6219]
136 Industrial producer price index overview 181 > 3 PRODUCER PRICE INDEX [3240, 4633, 5104, 6830]
136 Industri

169 Building the System of National Accounts - basic concepts 130 > 4 CENTRAL PRODUCT CLASSIFICATION (CPC) [772]
169 Building the System of National Accounts - basic concepts 221 > 5 STANDARD INTERNATIONAL TRADE CLASSIFICATION (SITC) [6066]
169 Building the System of National Accounts - basic concepts 178 > 4 CONSUMPTION OF FIXED CAPITAL [1133]
169 Building the System of National Accounts - basic concepts 236 > 4 CONSUMPTION OF FIXED CAPITAL [1133]
169 Building the System of National Accounts - basic concepts 161 > 3 OTHER ACCUMULATION ENTRY [4575]
169 Building the System of National Accounts - basic concepts 648 > 7 CLASSIFICATION OF INDIVIDUAL CONSUMPTION BY PURPOSE (COICOP) [854]
169 Building the System of National Accounts - basic concepts 648 > 4 HOUSEHOLD FINAL CONSUMPTION EXPENDITURE [2982]
169 Building the System of National Accounts - basic concepts 648 > 7 CLASSIFICATION OF THE FUNCTION OF GOVERNMENT (COFOG) [856]
169 Building the System of National Accounts - basic concepts 

178 HICP methodology 109 > 4 EUROPEAN ECONOMIC AREA (EEA) [2169]
178 HICP methodology 120 > 3 EUROPEAN CENTRAL BANK [2162]
178 HICP methodology 289 > 3 EUROPEAN CENTRAL BANK [2162]
178 HICP methodology 288 > 3 CONSUMER PRICE INDEX [1124, 1125, 2873]
178 HICP methodology 153 > 3 PURE PRICE CHANGE [5222]
178 HICP methodology 153 > 3 PURE PRICE INDEX [5223]
178 HICP methodology 65 > 4 HOUSEHOLD FINAL CONSUMPTION EXPENDITURE [2982]
178 HICP methodology 114 > 4 HOUSEHOLD FINAL CONSUMPTION EXPENDITURE [2982]
178 HICP methodology 109 > 4 HOUSEHOLD FINAL CONSUMPTION EXPENDITURE [2982]
178 HICP methodology 317 > 3 PRICE REFERENCE PERIOD [5023]
178 HICP methodology 317 > 3 PRICE REFERENCE PERIOD [5023]
178 HICP methodology 317 > 3 PRICE REFERENCE PERIOD [5023]
178 HICP methodology 215 > 3 PRICE REFERENCE PERIOD [5023]
178 HICP methodology 107 > 3 PRICE REFERENCE PERIOD [5023]
178 HICP methodology 107 > 3 PRICE REFERENCE PERIOD [5023]
178 HICP methodology 387 > 3 PRICE REFERENCE PERIOD [5023]
178

215
215 Living conditions in Europe - poverty and social exclusion 784 > 3 QUALITY OF LIFE [5263]
215 Living conditions in Europe - poverty and social exclusion 246 > 3 TYPE OF HOUSEHOLD [6662]
215 Living conditions in Europe - poverty and social exclusion 144 > 3 TYPE OF HOUSEHOLD [6662]
215 Living conditions in Europe - poverty and social exclusion 201 > 3 POPULATION AT RISK [4929]
215 Living conditions in Europe - poverty and social exclusion 186 > 3 QUALITY OF LIFE [5263]
215 Living conditions in Europe - poverty and social exclusion 353 > 3 PURCHASING POWER STANDARD [5219]
215 Living conditions in Europe - poverty and social exclusion 186 > 3 QUALITY OF LIFE [5263]
215 Living conditions in Europe - poverty and social exclusion 353 > 3 PURCHASING POWER STANDARD [5219]
216
217
218
219
220
220 Accidents and injuries statistics 219 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
220 Accidents and injuries statistics 219 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
220 Accidents and injuries st

240 From farm to fork ? a statistical journey 198 > 9 INTERNATIONAL COUNCIL FOR THE EXPLORATION OF THE SEA (ICES) [3343]
240 From farm to fork ? a statistical journey 271 > 4 COMMON AGRICULTURAL POLICY (CAP) [965, 966]
240 From farm to fork ? a statistical journey 68 > 3 UNIT OF MEASURE [3417, 6733, 6734]
240 From farm to fork ? a statistical journey 162 > 3 INTERNATIONAL SEA TRANSPORT [3365]
240 From farm to fork ? a statistical journey 44 > 3 EXPENDITURE ON FOOD [2239]
241
242
242 Ageing Europe - statistics on health and disability 244 > 3 DISABILITY-FREE LIFE EXPECTANCY [1642]
242 Ageing Europe - statistics on health and disability 131 > 3 QUALITY OF LIFE [5263]
242 Ageing Europe - statistics on health and disability 61 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
242 Ageing Europe - statistics on health and disability 61 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
242 Ageing Europe - statistics on health and disability 230 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
242 Ageing Europe - st

265 Being young in Europe today - living conditions for children 192 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
265 Being young in Europe today - living conditions for children 194 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
265 Being young in Europe today - living conditions for children 194 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
265 Being young in Europe today - living conditions for children 344 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
265 Being young in Europe today - living conditions for children 330 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
265 Being young in Europe today - living conditions for children 343 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
265 Being young in Europe today - living conditions for children 315 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
265 Being young in Europe today - living conditions for children 315 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
265 Being young 

272 Building the System of National Accounts - supply and use tables 216 > 3 BALANCE OF PAYMENT [382, 384, 385, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 959, 1317, 6403, 6407]
272 Building the System of National Accounts - supply and use tables 261 > 4 HOUSEHOLD FINAL CONSUMPTION EXPENDITURE [2982]
272 Building the System of National Accounts - supply and use tables 261 > 4 GOVERNMENT FINAL CONSUMPTION EXPENDITURE [2769]
272 Building the System of National Accounts - supply and use tables 261 > 3 GROSS CAPITAL FORMATION [2806]
272 Building the System of National Accounts - supply and use tables 59 > 4 SUPPLY AND USE TABLE [3005, 4860, 6254, 6255, 6256, 6257]
272 Building the System of National Accounts - supply and use tables 73 > 3 GROSS VALUE ADDED [2839, 2840, 2841, 2842, 2843]
272 Building the System of National Accounts - supply and use tables 94 > 3 GROSS CAPITAL FORMATION [2806]
272 Building the System of National Accounts - supply and use tables 48 > 3 GROSS VALUE

272 Building the System of National Accounts - supply and use tables 80 > 4 SUPPLY AND USE TABLE [3005, 4860, 6254, 6255, 6256, 6257]
272 Building the System of National Accounts - supply and use tables 603 > 4 SUPPLY AND USE TABLE [3005, 4860, 6254, 6255, 6256, 6257]
272 Building the System of National Accounts - supply and use tables 283 > 4 SUPPLY AND USE TABLE [3005, 4860, 6254, 6255, 6256, 6257]
272 Building the System of National Accounts - supply and use tables 325 > 4 SUPPLY AND USE TABLE [3005, 4860, 6254, 6255, 6256, 6257]
272 Building the System of National Accounts - supply and use tables 151 > 4 SUPPLY AND USE TABLE [3005, 4860, 6254, 6255, 6256, 6257]
272 Building the System of National Accounts - supply and use tables 116 > 4 SUPPLY AND USE TABLE [3005, 4860, 6254, 6255, 6256, 6257]
272 Building the System of National Accounts - supply and use tables 158 > 4 SUPPLY AND USE TABLE [3005, 4860, 6254, 6255, 6256, 6257]
272 Building the System of National Accounts - supply an

302 The EU in the world - labour market 136 > 3 ECONOMICALLY ACTIVE PERSON [1850]
302 The EU in the world - labour market 67 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
302 The EU in the world - labour market 32 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
302 The EU in the world - labour market 202 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
302 The EU in the world - labour market 115 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
302 The EU in the world - labour market 151 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
302 The EU in the world - labour market 179 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
302 The EU in the world - labour market 119 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
302 The EU in the world - labour market 128 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
303
303 Employment in sport 303 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
304
304 Manufacturing_of_sporting_goods 131 > 3 RA

337 European Neighbourhood Policy - East - indicators for sustainable development goals 421 > 3 INFANT MORTALITY RATE [3182]
337 European Neighbourhood Policy - East - indicators for sustainable development goals 157 > 3 INFANT MORTALITY RATE [3182]
337 European Neighbourhood Policy - East - indicators for sustainable development goals 122 > 3 INFANT MORTALITY RATE [3182]
337 European Neighbourhood Policy - East - indicators for sustainable development goals 128 > 4 GROSS DOMESTIC PRODUCT (GDP) [2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 4298, 4957]
337 European Neighbourhood Policy - East - indicators for sustainable development goals 249 > 3 SUSTAINABLE ECONOMIC GROWTH [6289]
337 European Neighbourhood Policy - East - indicators for sustainable development goals 249 > 3 SUSTAINABLE ECONOMIC GROWTH [6289]
337 European Neighbourhood Policy - East - indicators for sustainable development goals 138 > 3 ANNUAL GROWTH RATE [242, 248]
337 European Neighbourhood Policy - East - indicato

354 European Neighbourhood Policy - East - statistics on science, technology and digital society 279 > 4 RESEARCH AND DEVELOPMENT PERSONNEL [5543]
354 European Neighbourhood Policy - East - statistics on science, technology and digital society 279 > 4 RESEARCH AND DEVELOPMENT PERSONNEL [5543]
355
355 European Neighbourhood Policy - East - economic statistics 134 > 4 GROSS DOMESTIC PRODUCT (GDP) [2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 4298, 4957]
355 European Neighbourhood Policy - East - economic statistics 134 > 4 GROSS DOMESTIC PRODUCT (GDP) [2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 4298, 4957]
355 European Neighbourhood Policy - East - economic statistics 82 > 3 ANNUAL GROWTH RATE [242, 248]
355 European Neighbourhood Policy - East - economic statistics 342 > 3 PURCHASING POWER STANDARD [5219]
355 European Neighbourhood Policy - East - economic statistics 342 > 3 UNIT OF MEASURE [3417, 6733, 6734]
355 European Neighbourhood Policy - East - economic statistics 91 > 3 

372 First and second-generation immigrants - statistics on households 237 > 3 TYPE OF HOUSEHOLD [6662]
372 First and second-generation immigrants - statistics on households 237 > 3 TYPE OF HOUSEHOLD [6662]
372 First and second-generation immigrants - statistics on households 436 > 3 TYPE OF HOUSEHOLD [6662]
372 First and second-generation immigrants - statistics on households 146 > 3 TYPE OF HOUSEHOLD [6662]
372 First and second-generation immigrants - statistics on households 59 > 3 TYPE OF HOUSEHOLD [6662]
372 First and second-generation immigrants - statistics on households 143 > 3 TYPE OF HOUSEHOLD [6662]
372 First and second-generation immigrants - statistics on households 229 > 3 TYPE OF HOUSEHOLD [6662]
372 First and second-generation immigrants - statistics on households 204 > 3 TYPE OF HOUSEHOLD [6662]
372 First and second-generation immigrants - statistics on households 213 > 3 TYPE OF HOUSEHOLD [6662]
372 First and second-generation immigrants - statistics on households 69 >

396 Ageing Europe - statistics on pensions, income and expenditure 194 > 3 QUALITY OF LIFE [5263]
396 Ageing Europe - statistics on pensions, income and expenditure 238 > 3 HOUSEHOLD BUDGET SURVEY [2979]
396 Ageing Europe - statistics on pensions, income and expenditure 123 > 4 INCOME ELASTICITY OF DEMAND [3109]
396 Ageing Europe - statistics on pensions, income and expenditure 123 > 3 ELASTICITY OF DEMAND [1277, 1922, 3109]
396 Ageing Europe - statistics on pensions, income and expenditure 120 > 4 TOTAL EXPENDITURE ON HEALTH [6499]
396 Ageing Europe - statistics on pensions, income and expenditure 237 > 3 HOUSEHOLD BUDGET SURVEY [2979]
396 Ageing Europe - statistics on pensions, income and expenditure 123 > 4 INCOME ELASTICITY OF DEMAND [3109]
396 Ageing Europe - statistics on pensions, income and expenditure 123 > 3 ELASTICITY OF DEMAND [1277, 1922, 3109]
397
397 Interaction of household income, consumption and wealth - methodological issues 158 > 3 HOUSEHOLD BUDGET SURVEY [2979]
397

420 Interaction of household income, consumption and wealth ' statistics on taxation 54 > 3 TAX ON INCOME [1333, 1637, 4613, 6369]
421
421 International trade in goods - tariffs 246 > 4 WORLD TRADE ORGANISATION (WTO) [7048]
421 International trade in goods - tariffs 84 > 7 GENERAL AGREEMENT ON TARIFF AND TRADE (GATT) [2640]
421 International trade in goods - tariffs 184 > 5 GENERALISED SYSTEM OF PREFERENCE (GSP) [2658]
421 International trade in goods - tariffs 312 > 3 FREE TRADE AREA [297, 2569]
421 International trade in goods - tariffs 286 > 3 DOHA DEVELOPMENT AGENDA [1707]
421 International trade in goods - tariffs 448 > 4 INTERNATIONAL TRADE CENTRE (ITC) [3379]
421 International trade in goods - tariffs 448 > 3 RULE OF ORIGIN [5677]
422
422 International investment position statistics 45 > 3 INTERNATIONAL INVESTMENT POSITION [3350, 4224]
422 International investment position statistics 248 > 4 NET INTERNATIONAL INVESTMENT POSITION [4224]
422 International investment position stati

442 Material flow accounts statistics - material footprints 138 > 3 GROSS CAPITAL FORMATION [2806]
442 Material flow accounts statistics - material footprints 81 > 3 GROSS CAPITAL FORMATION [2806]
443
443 Resource productivity statistics 172 > 4 GROSS DOMESTIC PRODUCT (GDP) [2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 4298, 4957]
443 Resource productivity statistics 172 > 4 DOMESTIC MATERIAL CONSUMPTION (DMC) [1713]
443 Resource productivity statistics 264 > 3 PURCHASING POWER STANDARD [5219]
443 Resource productivity statistics 219 > 3 GROSS CAPITAL FORMATION [2806]
444
445
445 Mental health and related issues statistics 130 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
445 Mental health and related issues statistics 200 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
445 Mental health and related issues statistics 213 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
445 Mental health and related issues statistics 195 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
445 Mental health and related issues s

465 Main goods in extra-EU exports 49 > 4 OTHER MACHINERY AND EQUIPMENT [4601]
466
466 Migrant integration statistics - over-qualification 169 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
467
467 Migrant integration statistics - regional labour market indicators 370 > 3 ECONOMICALLY ACTIVE PERSON [1850]
467 Migrant integration statistics - regional labour market indicators 308 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
467 Migrant integration statistics - regional labour market indicators 283 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
468
468 Unemployment statistics and beyond 196 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
468 Unemployment statistics and beyond 292 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
468 Unemployment statistics and beyond 292 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
468 Unemployment statistics and beyond 267 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
469
469 Households - statistics on d

477 Production and international trade in high-tech products 207 > 3 ANNUAL GROWTH RATE [242, 248]
477 Production and international trade in high-tech products 123 > 3 ANNUAL GROWTH RATE [242, 248]
478
478 Production of lignite in the Western Balkans - statistics 81 > 3 CARBON DIOXIDE EMISSION [699]
478 Production of lignite in the Western Balkans - statistics 249 > 3 CARBON DIOXIDE EMISSION [699]
479
479 Production of lignite in the EU - statistics 228 > 3 GAS WORK GAS [2620]
479 Production of lignite in the EU - statistics 228 > 3 COKE OVEN COKE [919]
479 Production of lignite in the EU - statistics 81 > 3 CARBON DIOXIDE EMISSION [699]
480
481
482
482 Quality of life indicators - leisure 36 > 3 QUALITY OF LIFE [5263]
482 Quality of life indicators - leisure 55 > 3 QUALITY OF LIFE [5263]
482 Quality of life indicators - leisure 156 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
482 Quality of life indicators - leisure 181 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
48

512
512 Short-term business statistics - compiling indices at European level 236 > 3 HIGHER LEVEL INDEX [2929]
512 Short-term business statistics - compiling indices at European level 149 > 3 INDUSTRIAL PRODUCTION INDEX [3170]
512 Short-term business statistics - compiling indices at European level 119 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
512 Short-term business statistics - compiling indices at European level 117 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
512 Short-term business statistics - compiling indices at European level 174 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
513
514
514 Degree of urbanisation classification 164 > 7 ORGANISATION FOR ECONOMIC CO-OPERATION AND DEVELOPMENT (OECD) [4562]
515
515 EU labour force survey - main features and legal basis 269 > 4 INTERNATIONAL LABOUR ORGANISATION (ILO) [3351]
515 EU labour force survey - main features and legal basis 63 > 6 INTERNATIONAL STANDARD CLASSIFICATION OF OCCUPATION (ISCO) [3369]
516
516 Preventive services 374 > 3 

557 Transportation and storage statistics - NACE Rev. 2 100 > 3 APPARENT LABOUR PRODUCTIVITY [267]
557 Transportation and storage statistics - NACE Rev. 2 299 > 3 APPARENT LABOUR PRODUCTIVITY [267]
557 Transportation and storage statistics - NACE Rev. 2 354 > 3 APPARENT LABOUR PRODUCTIVITY [267]
557 Transportation and storage statistics - NACE Rev. 2 263 > 3 APPARENT LABOUR PRODUCTIVITY [267]
557 Transportation and storage statistics - NACE Rev. 2 198 > 3 APPARENT LABOUR PRODUCTIVITY [267]
558
559
560
561
561 Supply and use statistics 359 > 3 GROSS CAPITAL FORMATION [2806]
561 Supply and use statistics 263 > 4 HEALTH AND SOCIAL WORK [1988, 3114]
561 Supply and use statistics 46 > 3 COMPENSATION OF EMPLOYEE [988]
561 Supply and use statistics 350 > 4 HEALTH AND SOCIAL WORK [1988, 3114]
561 Supply and use statistics 106 > 3 GROSS VALUE ADDED [2839, 2840, 2841, 2842, 2843]
561 Supply and use statistics 303 > 3 GROSS CAPITAL FORMATION [2806]
561 Supply and use statistics 272 > 4 HEALTH AND

599 Regional household income statistics 124 > 3 PLACE OF WORK [2449, 4285, 4901]
599 Regional household income statistics 124 > 3 PLACE OF RESIDENCE [4896]
599 Regional household income statistics 201 > 3 COMPENSATION OF EMPLOYEE [988]
599 Regional household income statistics 201 > 3 NET OPERATING SURPLUS [4233]
599 Regional household income statistics 186 > 3 COMPENSATION OF EMPLOYEE [988]
599 Regional household income statistics 285 > 3 NET OPERATING SURPLUS [4233]
599 Regional household income statistics 95 > 4 CONSUMPTION OF FIXED CAPITAL [1133]
599 Regional household income statistics 276 > 3 RENT ON LAND [5498]
599 Regional household income statistics 208 > 3 COMPENSATION OF EMPLOYEE [988]
599 Regional household income statistics 105 > 3 COMPENSATION OF EMPLOYEE [988]
599 Regional household income statistics 418 > 3 NET OPERATING SURPLUS [4233]
599 Regional household income statistics 418 > 3 NET OPERATING SURPLUS [4233]
599 Regional household income statistics 137 > 3 COMPENSAT

620 International trade in services - an overview 179 > 4 INTERNATIONAL TRADE IN SERVICE [3380, 4016]
620 International trade in services - an overview 400 > 4 INTERNATIONAL TRADE IN SERVICE [3380, 4016]
620 International trade in services - an overview 208 > 3 ANNUAL GROWTH RATE [242, 248]
620 International trade in services - an overview 236 > 4 WORLD TRADE ORGANISATION (WTO) [7048]
620 International trade in services - an overview 147 > 7 GENERAL AGREEMENT ON TRADE IN SERVICE (GATS) [2641]
620 International trade in services - an overview 253 > 3 BARRIER TO ENTRY [431]
620 International trade in services - an overview 291 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
620 International trade in services - an overview 291 > 4 INTERNATIONAL TRADE IN SERVICE [3380, 4016]
620 International trade in services - an overview 288 > 3 OFFSHORE FINANCIAL CENTRE [4497]
621
622
623
624
625
625 International trade in goods by enterprise characteristic 89 > 5 SMALL AND MEDIUM-SIZED ENTERPRISE (SMES) [

651
651 Foreign language skills statistics 410 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills statistics 161 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills statistics 145 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills statistics 216 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills statistics 216 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills statistics 170 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills statistics 98 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills statistics 191 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills statistics 200 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills statistics 217 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
651 Foreign language skills

668 Environment statistics at subnational level 151 > 3 OTHER WOODED LAND [4616]
668 Environment statistics at subnational level 123 > 3 OTHER WOODED LAND [4616]
668 Environment statistics at subnational level 199 > 3 OTHER WOODED LAND [4616]
668 Environment statistics at subnational level 199 > 3 OTHER WOODED LAND [4616]
668 Environment statistics at subnational level 164 > 3 OTHER WOODED LAND [4616]
668 Environment statistics at subnational level 138 > 3 OTHER WOODED LAND [4616]
668 Environment statistics at subnational level 108 > 3 CARBON DIOXIDE EMISSION [699]
669
669 Enlargement countries - industry and service statistics 102 > 3 INDUSTRIAL PRODUCTION INDEX [3170]
669 Enlargement countries - industry and service statistics 145 > 3 INDUSTRIAL PRODUCTION INDEX [3170]
669 Enlargement countries - industry and service statistics 279 > 3 PRODUCER PRICE INDEX [3240, 4633, 5104, 6830]
670
670 Enlargement countries - agriculture, forestry and fishing statistics 32 > 3 GROSS VALUE ADDED [2

690 Being young in Europe today - health 157 > 3 INFANT MORTALITY RATE [3182]
690 Being young in Europe today - health 164 > 3 INFANT MORTALITY RATE [3182]
690 Being young in Europe today - health 150 > 3 INFANT MORTALITY RATE [3182]
690 Being young in Europe today - health 233 > 3 INFANT MORTALITY RATE [3182]
690 Being young in Europe today - health 241 > 3 INFANT MORTALITY RATE [3182]
690 Being young in Europe today - health 178 > 3 INFANT MORTALITY RATE [3182]
690 Being young in Europe today - health 144 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
690 Being young in Europe today - health 144 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
690 Being young in Europe today - health 144 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
690 Being young in Europe today - health 319 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
690 Being young in Europe today - health 137 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
690 Being young in Europe today - health 152 > 3 CAUSE OF DEATH [743, 784, 1165, 6693]
690 Being 

693 Being young in Europe today - family and society 239 > 3 TYPE OF HOUSEHOLD [6662]
693 Being young in Europe today - family and society 135 > 3 TYPE OF HOUSEHOLD [6662]
693 Being young in Europe today - family and society 233 > 3 TYPE OF HOUSEHOLD [6662]
693 Being young in Europe today - family and society 172 > 3 TYPE OF HOUSEHOLD [6662]
694
695
696
696 Asia-Europe Meeting (ASEM) - a statistical portrait - economy and finance 147 > 3 RATE OF CHANGE [247, 5346, 5937, 6640]
696 Asia-Europe Meeting (ASEM) - a statistical portrait - economy and finance 222 > 4 GROSS NATIONAL INCOME (GNI) [2825]
696 Asia-Europe Meeting (ASEM) - a statistical portrait - economy and finance 211 > 3 GROSS CAPITAL FORMATION [2806]
696 Asia-Europe Meeting (ASEM) - a statistical portrait - economy and finance 211 > 3 GROSS CAPITAL FORMATION [2806]
696 Asia-Europe Meeting (ASEM) - a statistical portrait - economy and finance 192 > 3 GROSS CAPITAL FORMATION [2806]
696 Asia-Europe Meeting (ASEM) - a statistical 

723 Building the System of National Accounts - context 213 > 3 BALANCE OF PAYMENT [382, 384, 385, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 959, 1317, 6403, 6407]
723 Building the System of National Accounts - context 213 > 4 BALANCE OF PAYMENT MANUAL [382]
723 Building the System of National Accounts - context 187 > 3 FOREIGN DIRECT INVESTMENT [2506, 2507]
723 Building the System of National Accounts - context 826 > 3 OTHER ACCUMULATION ENTRY [4575]
723 Building the System of National Accounts - context 208 > 4 GROSS DOMESTIC PRODUCT (GDP) [2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 4298, 4957]
723 Building the System of National Accounts - context 171 > 3 NET NATIONAL INCOME [4231]
723 Building the System of National Accounts - context 756 > 3 BALANCE OF PAYMENT [382, 384, 385, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 959, 1317, 6403, 6407]
723 Building the System of National Accounts - context 756 > 3 INTERNATIONAL INVESTMENT POSITION [3350, 4224]
723

731 City statistics - education and training 477 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
731 City statistics - education and training 262 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
732
733
733 Building the System of National Accounts - volume measures 272 > 4 CONSUMPTION OF FIXED CAPITAL [1133]
733 Building the System of National Accounts - volume measures 434 > 4 GOOD AND SERVICE ACCOUNT [2727]
733 Building the System of National Accounts - volume measures 103 > 3 PAASCHE PRICE INDEX [2693, 3716, 4682]
733 Building the System of National Accounts - volume measures 103 > 3 LASPEYRES VOLUME INDEX [3595]
733 Building the System of National Accounts - volume measures 244 > 3 LASPEYRES VOLUME INDEX [3595]
733 Building the System of National Accounts - volume measures 89 > 3 LASPEYRES PRICE INDEX [2690, 3593, 3715]
733 Building the System of National Accounts - volume measures 89 > 3 PAASCHE VOLUME INDEX [4684]
733 Building the System of National Accounts - volume m

753
754
754 Living conditions in Europe - introduction 190 > 4 GROSS DOMESTIC PRODUCT (GDP) [2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 4298, 4957]
755
755 MEETS programme 444 > 3 FOREIGN DIRECT INVESTMENT [2506, 2507]
756
756 MEETS programme - framework for business-related statistics 176 > 5 LOCAL KIND OF ACTIVITY UNIT [3701]
756 MEETS programme - framework for business-related statistics 112 > 5 LOCAL KIND OF ACTIVITY UNIT [3701]
756 MEETS programme - framework for business-related statistics 381 > 3 FOREIGN DIRECT INVESTMENT [2506, 2507]
757
757 Merging statistics and geospatial information, 2012 projects - Hungary 52 > 4 NATIONAL STATISTICAL OFFICE (NSO) [4161]
758
758 Merging statistics and geospatial information, 2012 projects - Malta 251 > 4 NATIONAL STATISTICAL OFFICE (NSO) [4161]
759
759 Merging statistics and geospatial information, 2012 projects - Poland 358 > 3 PLACE OF WORK [2449, 4285, 4901]
759 Merging statistics and geospatial information, 2012 projects - Poland 3

789 Merging statistics and geospatial information, 2014 projects - Portugal 197 > 3 CONSUMER PRICE INDEX [1124, 1125, 2873]
790
790 Statistics in development cooperation - coordination 224 > 3 MILLENNIUM DEVELOPMENT GOAL [3973, 3974]
790 Statistics in development cooperation - coordination 224 > 3 MILLENNIUM DEVELOPMENT GOAL [3973, 3974]
790 Statistics in development cooperation - coordination 434 > 4 INTERNATIONAL DEVELOPMENT ASSOCIATION (IDA) [3344]
791
791 Statistics in development cooperation - data availability 422 > 3 CONSUMER PRICE INDEX [1124, 1125, 2873]
791 Statistics in development cooperation - data availability 422 > 3 HOUSEHOLD BUDGET SURVEY [2979]
791 Statistics in development cooperation - data availability 170 > 4 NATIONAL STATISTICAL SYSTEM (NSS) [4162]
791 Statistics in development cooperation - data availability 107 > 4 NATIONAL STATISTICAL INSTITUTE (NSI) [4160]
791 Statistics in development cooperation - data availability 39 > 3 RATE OF CHANGE [247, 5346, 5937, 66

807 Statistics in development cooperation - advocacy 385 > 3 MILLENNIUM DEVELOPMENT GOAL [3973, 3974]
808
808 Statistics in development cooperation - improving statistical capacity 294 > 3 MILLENNIUM DEVELOPMENT GOAL [3973, 3974]
808 Statistics in development cooperation - improving statistical capacity 344 > 4 NATIONAL STATISTICAL SYSTEM (NSS) [4162]
809
809 Statistics in development cooperation - EU support to partner countries 95 > 4 NATIONAL STATISTICAL SYSTEM (NSS) [4162]
809 Statistics in development cooperation - EU support to partner countries 252 > 3 SECTOR WIDE APPROACH [5802]
810
811
811 Statistics in development cooperation - development indicators 289 > 4 NATIONAL STATISTICAL SYSTEM (NSS) [4162]
812
813
813 Merging statistics and geospatial information, 2013 projects - Italy 377 > 3 PLACE OF RESIDENCE [4896]
813 Merging statistics and geospatial information, 2013 projects - Italy 84 > 3 LOSS OF INFORMATION [3745]
814
815
816
816 Merging statistics and geospatial informatio

853 Disability statistics background - Labour force survey - proxy analysis 242 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
854
854 Disability statistics background - Labour force survey - non-response analysis 189 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
854 Disability statistics background - Labour force survey - non-response analysis 303 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
854 Disability statistics background - Labour force survey - non-response analysis 194 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
854 Disability statistics background - Labour force survey - non-response analysis 286 > 3 LEVEL OF EDUCATION [3628, 4261, 4262, 4956, 5409]
855
855 Continuing Vocational Training Survey (CVTS) methodology 178 > 4 HEALTH AND SOCIAL WORK [1988, 3114]
856
856 City statistics - economic aspects 168 > 4 INTERNATIONAL LABOUR ORGANISATION (ILO) [3351]
856 City statistics - economic aspects 296 > 3 ECONOMICALLY ACTIVE POPULATION [1851]
856 Ci

In [13]:
# Collapse each list of matched phrases into a {phrase: count} dict, most frequent first.
for col in ['3-Phrases', '4-Phrases', '5-and-above-Phrases']:
    res[col] = res[col].apply(lambda x: dict(Counter(x).most_common()))
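
As a minimal sketch of what the cell above does to a single row, assuming each cell holds a plain list of matched phrases (the sample list here is illustrative, not taken from the data):

```python
from collections import Counter

# Illustrative list of matches for one article (assumed shape of a cell).
phrases = ["LEVEL OF EDUCATION", "QUALITY OF LIFE", "LEVEL OF EDUCATION"]

# most_common() returns (phrase, count) pairs sorted by descending count;
# dict() keeps that order, so the most frequent phrase comes first.
counts = dict(Counter(phrases).most_common())

# An article whose list is empty ends up with an empty dict.
empty = dict(Counter([]).most_common())

print(counts)
print(empty)
```

Because Python dicts preserve insertion order, the resulting column can be read left to right as a ranking of the glossary terms found in each article.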



In [14]:
# Left join keeps every article; articles without any match get NaN in the phrase columns.
res = pd.merge(SE_df[['id', 'title']], res, left_on='id', right_on='doc_id', how='left')
res = res.drop(columns=['doc_id'])
res

Unnamed: 0,id,title,3-Phrases,4-Phrases,5-and-above-Phrases,Terms
0,7,Accidents at work statistics,{},{'HEALTH AND SOCIAL WORK': 1},{},"{' health and social work': 1, ' Incomes of he..."
1,13,National accounts and GDP,"{'RATE OF CHANGE': 15, 'GROSS VALUE ADDED': 6,...",{'HEALTH AND SOCIAL WORK': 3},"{'TAX ON PRODUCTION AND IMPORT': 2, 'EXTERNAL ...",{' Annualised growth rate (annualised rate of ...
2,16,Railway safety statistics in the EU,,,,
3,17,Railway freight transport statistics,,,,
4,18,Railway passenger transport statistics - quart...,,,,
...,...,...,...,...,...,...
860,10456,"Merging statistics and geospatial information,...",,,,
861,10470,"Merging statistics and geospatial information,...",,,,
862,10506,Methods for compiling PEEIs in short-term busi...,"{'PRODUCER PRICE INDEX': 3, 'CONSUMER PRICE IN...",{},{},"{' Input producer price indices': 3, ' Output ..."
863,10531,Building the System of National Accounts - adm...,"{'GROSS VALUE ADDED': 4, 'BALANCE OF PAYMENT':...","{'CONSUMPTION OF FIXED CAPITAL': 2, 'VALUE ADD...",{'FINANCIAL INTERMEDIATION SERVICE INDIRECTLY ...,"{' Deductible value added tax (VAT)': 1, ' Inv..."
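
The table mixes two kinds of "nothing found": `{}` for articles that had matches in some other column, and empty cells (NaN) for articles that the left join could not pair with any row. The sketch below reproduces this on two hypothetical articles and shows one possible way to normalise the NaN cells to `{}`. The frame names, ids and titles are illustrative assumptions, and the normalisation step is not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-ins for SE_df and res.
articles = pd.DataFrame({"id": [7, 16], "title": ["Article A", "Article B"]})
matches = pd.DataFrame({"doc_id": [7], "3-Phrases": [{"RATE OF CHANGE": 2}]})

# Same join as in the notebook: keep every article, matched or not.
merged = pd.merge(articles, matches, left_on="id", right_on="doc_id", how="left")
merged = merged.drop(columns=["doc_id"])

# Articles without any match come back as NaN, not as an empty dict.
unmatched = merged["3-Phrases"].isna()

# Optional normalisation (an assumption, not the author's step):
# replace NaN with {} so every cell in the column has the same type.
merged["3-Phrases"] = merged["3-Phrases"].apply(
    lambda d: d if isinstance(d, dict) else {}
)

print(unmatched.tolist())
print(merged)
```

Keeping the NaN values, as the notebook does, has the advantage that "no matches at all" stays distinguishable from "no phrases of this length".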


In [15]:
outfile = file_name('Phrase_Matcher_SE_OECD','xlsx')
res.to_excel(outfile)
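
The `file_name` helper defined earlier builds the name from month, day, hour and minute without zero padding and without the year, so output files from different years can collide and do not sort chronologically. A hedged alternative is sketched below. `timestamped_name` is a hypothetical helper, not part of the notebook, and the fixed date is only there to make the example reproducible.

```python
from datetime import datetime

def timestamped_name(prefix, ext, now=None):
    """Hypothetical variant of file_name() with a year and zero-padded fields."""
    now = now or datetime.now()
    # %Y%m%d_%H%M gives names that sort chronologically, e.g. 20220307_0905.
    return f"{prefix}_{now.strftime('%Y%m%d_%H%M')}.{ext}"

# Fixed timestamp for a reproducible example; omit `now` in real use.
name = timestamped_name("Phrase_Matcher_SE_OECD", "xlsx", datetime(2022, 3, 7, 9, 5))
print(name)
```

The result could be passed to `res.to_excel(...)` in the same way as `outfile` above.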
