The NACE Rev. 2 classification (Statistical Classification of Economic Activities in the European Community) is a standardized system for categorizing economic activities. It consists of 21 broad sections (identified by letters A–U), further divided into divisions, groups, and classes.

Here are the 21 broad sections of NACE Rev. 2:


A. Agriculture, Forestry and Fishing

    Crop and animal production, hunting and related service activities
    Forestry and logging
    Fishing and aquaculture

B. Mining and Quarrying

    Mining of coal and lignite
    Extraction of crude petroleum and natural gas
    Mining of metal ores
    Other mining and quarrying
    Mining support service activities

C. Manufacturing

    Manufacture of food products, beverages, and tobacco products
    Manufacture of textiles, clothing, leather, and related products
    Manufacture of wood and paper products
    Manufacture of chemicals, pharmaceuticals, rubber, and plastic products
    Manufacture of basic metals and fabricated metal products
    Manufacture of machinery, equipment, and transport vehicles
    Other manufacturing activities

D. Electricity, Gas, Steam, and Air Conditioning Supply

    Production and distribution of electricity
    Distribution of gaseous fuels
    Steam and air conditioning supply

E. Water Supply; Sewerage, Waste Management, and Remediation Activities

    Water collection, treatment, and supply
    Sewerage
    Waste collection, treatment, and disposal activities
    Remediation activities and other waste management services

F. Construction

    Construction of buildings
    Civil engineering
    Specialized construction activities

G. Wholesale and Retail Trade; Repair of Motor Vehicles and Motorcycles

    Wholesale and retail trade and repair of motor vehicles
    Wholesale trade (except motor vehicles)
    Retail trade (except motor vehicles)

H. Transportation and Storage

    Land transport and transport via pipelines
    Water transport
    Air transport
    Warehousing and support activities for transportation
    Postal and courier activities

I. Accommodation and Food Service Activities

    Accommodation
    Food and beverage service activities

J. Information and Communication

    Publishing activities
    Motion picture, video, and television production, sound recording, and music publishing
    Telecommunications
    Computer programming, consultancy, and related activities
    Information service activities

K. Financial and Insurance Activities

    Financial service activities
    Insurance, reinsurance, and pension funding
    Activities auxiliary to financial services and insurance

L. Real Estate Activities

    Real estate activities

M. Professional, Scientific, and Technical Activities

    Legal and accounting activities
    Management consultancy
    Architectural and engineering activities
    Scientific research and development
    Advertising and market research
    Other professional, scientific, and technical activities
    Veterinary activities

N. Administrative and Support Service Activities

    Rental and leasing activities
    Employment activities
    Travel agency, tour operator, and reservation services
    Security and investigation activities
    Services to buildings and landscape activities
    Office administrative and support activities

O. Public Administration and Defence; Compulsory Social Security

    Administration of the state, economic and social policy
    Defence activities
    Provision of services for the community

P. Education

    Education

Q. Human Health and Social Work Activities

    Human health activities
    Residential care activities
    Social work activities without accommodation

R. Arts, Entertainment, and Recreation

    Creative, arts, and entertainment activities
    Libraries, archives, museums, and other cultural activities
    Sports, amusement, and recreation activities

S. Other Service Activities

    Activities of membership organizations
    Repair of computers and personal goods
    Other personal service activities

T. Activities of Households as Employers; Undifferentiated Goods- and Services-Producing Activities of Households for Own Use

    Activities of households as employers
    Undifferentiated goods- and services-producing activities of households for own use

U. Activities of Extraterritorial Organizations and Bodies

    Activities of international organizations (e.g., the United Nations, embassies)

These sections form the broadest categories of the NACE Rev. 2 classification. They are further broken down into more detailed divisions, groups, and classes.



In [1]:
%%javascript
// to avoid scroll in windows
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
}

<IPython.core.display.Javascript object>

In [2]:
import pandas as pd
import numpy as np

In [3]:
pd.set_option('display.max_columns', 1000)
pd.set_option('display.max_rows', 1000)

In [4]:
mini_naio = pd.read_pickle('mini_naio.xp')
sbs = pd.read_pickle('sbs.xp')
lc = pd.read_pickle('lc.xp')

mini_naio

Unnamed: 0,IND_USE (Labels),Total intermediate goods,Final consumption expenditure by government,Final consumption expenditure by households,Exports of goods and services,Changes in inventories and acquisition less disposals of valuables,Gross fixed capital formation,"Added value, gross",Compensation of employees
0,Total,16939701.18,3375443.51,7283949.99,3291159.74,360243.46,3320258.7,14303899.14,7447036.79
1,"Products of agriculture, hunting and related s...",381871.38,1756.62,142619.06,30647.16,10474.61,7444.62,244253.17,54367.01
2,"Products of forestry, logging and related serv...",42864.37,878.87,7533.2,2148.11,11085.96,456.39,29364.21,9874.05
3,Fish and other fishing products; aquaculture p...,10980.08,10.07,10712.56,575.75,-45.88,56.16,6506.32,2826.59
4,Mining and quarrying,337295.56,333.07,12425.73,10117.36,2681.28,1268.44,61205.82,23611.51
5,"Food, beverages and tobacco products",508405.01,1203.3,694013.1,154747.91,36172.89,0.55,284672.03,147133.21
6,"Textiles, wearing apparel, leather and related...",121877.73,729.71,159969.25,61689.96,29565.8,1603.87,70250.23,39609.14
7,"Wood and of products of wood and cork, except ...",143899.99,0.47,10975.47,21934.64,6849.82,9296.21,53407.64,28282.11
8,Paper and paper products,160065.88,165.9,30814.85,33893.65,-1069.85,0.52,55568.55,29110.21
9,Printing and recording services,62412.3,7.16,5172.42,1151.64,-706.8,5.73,28104.73,16477.14


In [5]:
# total use considered for the split: household consumption + investment + intermediate goods
total_use = (mini_naio['Final consumption expenditure by households']
             + mini_naio['Gross fixed capital formation']
             + mini_naio['Total intermediate goods'])

mini_naio['Consumption good share'] = mini_naio['Final consumption expenditure by households'] / total_use

mini_naio['Investment good share'] = mini_naio['Gross fixed capital formation'] / total_use

mini_naio['Intermediate good share'] = mini_naio['Total intermediate goods'] / total_use

pd.set_option('display.max_colwidth', 100)
mini_naio

Unnamed: 0,IND_USE (Labels),Total intermediate goods,Final consumption expenditure by government,Final consumption expenditure by households,Exports of goods and services,Changes in inventories and acquisition less disposals of valuables,Gross fixed capital formation,"Added value, gross",Compensation of employees,Consumption good share,Investment good share,Intermediate good share
0,Total,16939701.18,3375443.51,7283949.99,3291159.74,360243.46,3320258.7,14303899.14,7447036.79,0.264449,0.1205442,0.615007
1,"Products of agriculture, hunting and related services",381871.38,1756.62,142619.06,30647.16,10474.61,7444.62,244253.17,54367.01,0.268114,0.01399535,0.717891
2,"Products of forestry, logging and related services",42864.37,878.87,7533.2,2148.11,11085.96,456.39,29364.21,9874.05,0.148134,0.008974522,0.842891
3,Fish and other fishing products; aquaculture products; support services to fishing,10980.08,10.07,10712.56,575.75,-45.88,56.16,6506.32,2826.59,0.492559,0.002582211,0.504859
4,Mining and quarrying,337295.56,333.07,12425.73,10117.36,2681.28,1268.44,61205.82,23611.51,0.035402,0.003613895,0.960984
5,"Food, beverages and tobacco products",508405.01,1203.3,694013.1,154747.91,36172.89,0.55,284672.03,147133.21,0.577181,4.574114e-07,0.422819
6,"Textiles, wearing apparel, leather and related products",121877.73,729.71,159969.25,61689.96,29565.8,1603.87,70250.23,39609.14,0.564363,0.005658371,0.429978
7,"Wood and of products of wood and cork, except furniture; articles of straw and plaiting materials",143899.99,0.47,10975.47,21934.64,6849.82,9296.21,53407.64,28282.11,0.066854,0.05662493,0.876521
8,Paper and paper products,160065.88,165.9,30814.85,33893.65,-1069.85,0.52,55568.55,29110.21,0.161435,2.724207e-06,0.838563
9,Printing and recording services,62412.3,7.16,5172.42,1151.64,-706.8,5.73,28104.73,16477.14,0.076526,8.477529e-05,0.923389


In [6]:
(mini_naio['Consumption good share']+ mini_naio['Intermediate good share']+ mini_naio['Investment good share']).sum()

65.0
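
The value above is a consistency check: the three shares of a row have the same denominator, so they sum to 1, and the grand total equals the number of rows with valid data. A minimal sketch of the same check on two rows copied from the table above; the names `toy` and `total_use` are illustrative and not part of the notebook:

```python
import pandas as pd

# two rows copied from the mini_naio table shown above
toy = pd.DataFrame(
    {
        "Total intermediate goods": [381871.38, 337295.56],
        "Final consumption expenditure by households": [142619.06, 12425.73],
        "Gross fixed capital formation": [7444.62, 1268.44],
    },
    index=["Agriculture", "Mining and quarrying"],
)

# common denominator of the three shares (illustrative name)
total_use = toy.sum(axis=1)

# divide every column by the row total to obtain the three shares
shares = toy.div(total_use, axis=0)

# each row sums to 1, so the grand total equals the number of rows
row_sums = shares.sum(axis=1)
grand_total = row_sums.sum()
print(row_sums)
print(grand_total)
```

With the full table, a total different from the number of valid rows would point to a mistake in one of the three share formulas.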

In [7]:
pd.set_option('display.max_colwidth', 100)
sbs

Unnamed: 0,SIZE_EMP (Labels),Total,From 0 to 1 person employed,From 0 to 9 persons employed,From 2 to 9 persons employed,From 10 to 19 persons employed,From 20 to 49 persons employed,From 50 to 249 persons employed,250 persons employed or more
0,Mining and quarrying,371000,,30000,,24400.0,38600,63848,213000.0
1,Manufacturing,30007527,,3720000,,2130000.0,3137995,6501120,14514113.0
2,"Electricity, gas, steam and air conditioning supply",1380000,,192263,,26554.0,48700,137668,977247.0
3,"Water supply; sewerage, waste management and remediation activities",1585225,,134000,,80000.0,150000,370000,
4,Construction,13814274,2292700.0,6463861,4171161.0,1989977.0,1869881,1756785,1733772.0
5,Wholesale and retail trade; repair of motor vehicles and motorcycles,29779934,2869697.0,10159638,7289941.0,2867238.0,3099214,4100000,9600000.0
6,Transportation and storage,10368577,795843.0,2132860,1337017.0,778285.0,1090577,1605824,4761030.0
7,Accommodation and food service activities,10888928,670000.0,4500000,3800000.0,1850000.0,1570000,1421984,1570000.0
8,Information and communication,7169884,839595.0,1694054,854459.0,458755.0,648890,1252840,3115346.0
9,Financial and insurance activities,4950777,,912551,,,177248,494777,


In [8]:
pd.set_option('display.max_colwidth', 100)
lc

Unnamed: 0,NACE_R2,lc
0,"Industry, construction and services (except activities of households as employers and extra-terr...",
1,"Industry, construction and services (except public administration, defense, compulsory social se...",31.8
2,Business economy,31.6
3,Industry and construction,
4,Industry (except construction),32.2
5,Mining and quarrying,31.8
6,Manufacturing,32.0
7,"Electricity, gas, steam and air conditioning supply",45.0
8,"Water supply; sewerage, waste management and remediation activities",26.2
9,Construction,28.5


In [9]:
column_names = ['#','NACE definition', 'type', 'dimensional class', 'Share of firms',\
                'Share of firms per sbs sector', 'sbs reference row'] 

# type = consumption / investment / intermediate good

# dimensional class gives the range of the number of workers per firm in the class

# share of firms is computed from the weight of each subsector and of each type (C-I-Int) on the total

# share of firms per sbs sector is the weight of each subsector within its sbs sector,
# based on compensation of employees and regardless of type (agriculture is calculated separately)

ff = pd.DataFrame(columns=column_names)


workforce = sbs['Total'].sum()
agriculture = 19000000 # agricultural workers: ~10 MLN single-worker firms (in line with Eurostat data)
                       # plus a further 9 MLN employees (cf. ./agricoltura.txt)
workforce +=  agriculture
agricultureFirms = agriculture / 1.9 # i.e. 10 MLN firms, with 1.9 workers per firm on average
# public administration (PA) and self-employed professionals are not included

numberOfFirms = sbs['From 0 to 9 persons employed'].sum() / 5\
              + sbs['From 10 to 19 persons employed'].sum() / 15\
              + sbs['From 20 to 49 persons employed'].sum() / 35\
              + sbs['From 50 to 249 persons employed'].sum() / 150\
              + sbs['250 persons employed or more'].sum() / 1000\
              + agricultureFirms # to be placed after in 0-9 employed class
numberOfFirms = int(numberOfFirms)
numberOfFirms


20751071
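
The estimate above divides the workers of each dimensional class by an assumed average firm size (5, 15, 35, 150 and 1000) and sums the results. A minimal sketch of that step for a single sbs row, using the "Mining and quarrying" figures from the sbs table; `AVERAGE_FIRM_SIZE` and `estimate_firms` are illustrative names, not part of the notebook:

```python
# assumed average number of workers per firm in each class, as in the cell above
AVERAGE_FIRM_SIZE = {"0-9": 5, "10-19": 15, "20-49": 35, "50-249": 150, ">=250": 1000}

# workers per class for "Mining and quarrying", copied from the sbs table
mining_workers = {
    "0-9": 30000,
    "10-19": 24400,
    "20-49": 38600,
    "50-249": 63848,
    ">=250": 213000,
}


def estimate_firms(workers_by_class):
    """Estimate the number of firms as workers / assumed average firm size, summed over classes."""
    return sum(
        workers / AVERAGE_FIRM_SIZE[size_class]
        for size_class, workers in workers_by_class.items()
    )


mining_firms = estimate_firms(mining_workers)
print(round(mining_firms))
```

In the notebook the same division is applied to the column totals, where `sum()` skips the missing (NaN) cells of the sbs table, and the agricultural firms are added at the end.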

In [10]:
agric_tot = (mini_naio['Compensation of employees'].loc[1] \
               + mini_naio['Compensation of employees'].loc[2]\
               + mini_naio['Compensation of employees'].loc[3])

agricultureSectorWeight = mini_naio['Compensation of employees'].loc[1]/ agric_tot
silvicultureSectorWeight = mini_naio['Compensation of employees'].loc[2]/ agric_tot
fishingSectorWeight = mini_naio['Compensation of employees'].loc[3]/ agric_tot

# agriculture
share = (agricultureFirms/numberOfFirms) * mini_naio['Compensation of employees'].loc[1]/ agric_tot \
                                    * mini_naio['Consumption good share'].loc[1]
ff.loc[0] = [1, "Agriculture", "C", "0-9", share, agricultureSectorWeight, np.nan]

share = (agricultureFirms/numberOfFirms) * mini_naio['Compensation of employees'].loc[1]/ agric_tot \
                                    * mini_naio['Investment good share'].loc[1]
ff.loc[1] = [1,"Agriculture", "I", "0-9", share, agricultureSectorWeight, np.nan]

share = (agricultureFirms/numberOfFirms) * mini_naio['Compensation of employees'].loc[1]/ agric_tot \
                                    * mini_naio['Intermediate good share'].loc[1]
ff.loc[2] = [1, "Agriculture", "Int", "0-9", share, agricultureSectorWeight, np.nan]

# silviculture
share = (agricultureFirms/numberOfFirms) * mini_naio['Compensation of employees'].loc[2]/ agric_tot \
                                    * mini_naio['Consumption good share'].loc[2]
ff.loc[3] = [2, "Silviculture", "C", "0-9", share, silvicultureSectorWeight, np.nan]

share = (agricultureFirms/numberOfFirms) * mini_naio['Compensation of employees'].loc[2]/ agric_tot \
                                    * mini_naio['Investment good share'].loc[2]
ff.loc[4] = [2, "Silviculture", "I", "0-9", share, silvicultureSectorWeight, np.nan]

share = (agricultureFirms/numberOfFirms) * mini_naio['Compensation of employees'].loc[2]/ agric_tot \
                                    * mini_naio['Intermediate good share'].loc[2]
ff.loc[5] = [2, "Silviculture", "Int", "0-9", share, silvicultureSectorWeight, np.nan]

# fishing (no investments, but with the row anyway)
share = (agricultureFirms/numberOfFirms) * mini_naio['Compensation of employees'].loc[3]/ agric_tot\
                                    * mini_naio['Consumption good share'].loc[3]              
ff.loc[6] = [3, "Fishing", "C", "0-9", share, fishingSectorWeight, np.nan]

share = (agricultureFirms/numberOfFirms) * mini_naio['Compensation of employees'].loc[3]/ agric_tot\
                                    * mini_naio['Investment good share'].loc[3]              
ff.loc[7] = [3, "Fishing", "I", "0-9", share, fishingSectorWeight, np.nan]

share = (agricultureFirms/numberOfFirms) * mini_naio['Compensation of employees'].loc[3]/ agric_tot\
                                    * mini_naio['Intermediate good share'].loc[3]               
ff.loc[8] = [3, "Fishing", "Int", "0-9", share, fishingSectorWeight, np.nan]

In the following cell, we compute the share of firms for each row of the firm-feature file that we are generating: each row specifies the NACE sector, the type of good it produces (consumption, investment or intermediate) and its dimensional class in terms of number of employees.
We calculate the share as the ratio between the number of firms of the sector (estimated from the number of workers per dimensional class in standard cases, with some exceptions, e.g. agriculture) and the total number of firms. The subsectors of each sbs sector are weighted by their Compensation of employees, while the split among consumption, investment and intermediate goods uses the shares computed above.
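
As a worked example of this formula, the sketch below recomputes one share by hand (Mining and quarrying, consumption goods, class 0-9) from values displayed in the tables above. The variable names are illustrative, and the consumption share is the rounded value as displayed, so the result is only approximate:

```python
# values copied from the tables and outputs above
number_of_firms = 20751071    # numberOfFirms
workers_0_9 = 30000           # sbs, Mining and quarrying, 'From 0 to 9 persons employed'
average_size_0_9 = 5          # assumed average firm size of the 0-9 class
sector_weight = 1.0           # Mining and quarrying is the only subsector of its sbs sector
consumption_share = 0.035402  # 'Consumption good share' of Mining and quarrying (rounded)

# firms of the sector in the class, estimated from the workers
firms_in_class = workers_0_9 / average_size_0_9

# share of these firms on the total, restricted to consumption goods
share = sector_weight * firms_in_class / number_of_firms * consumption_share
print(f"{share:.5e}")
```

The result can be compared with the 'Share of firms' value of the first Mining and quarrying row in the final ff table.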

In [11]:
def shareCalculation(ffRow, miniNaioRow, miniNaioRange, naceDef, sbsRow):
    # weight of the subsector within its sbs sector, based on compensation of employees
    totalCompensationPerSector = sum(mini_naio['Compensation of employees'].loc[i] for i in miniNaioRange)
    sectorWeight = mini_naio['Compensation of employees'].loc[miniNaioRow] / totalCompensationPerSector

    # (sbs column, dimensional class label, assumed average number of workers per firm)
    sizeClasses = [('From 0 to 9 persons employed', '0-9', 5),
                   ('From 10 to 19 persons employed', '10-19', 15),
                   ('From 20 to 49 persons employed', '20-49', 35),
                   ('From 50 to 249 persons employed', '50-249', 150),
                   ('250 persons employed or more', '>=250', 1000)]

    # (type label, mini_naio column with the corresponding share)
    goodTypes = [('C', 'Consumption good share'),
                 ('I', 'Investment good share'),
                 ('Int', 'Intermediate good share')]

    # 15 consecutive rows of ff: 5 dimensional classes for each of the 3 types
    offset = 0
    for goodType, shareColumn in goodTypes:
        for sbsColumn, sizeLabel, avgWorkers in sizeClasses:
            share = sectorWeight * (sbs[sbsColumn][sbsRow] / avgWorkers) / numberOfFirms \
                    * mini_naio[shareColumn].loc[miniNaioRow]
            ff.loc[ffRow + offset] = [miniNaioRow, naceDef, goodType, sizeLabel, share, sectorWeight, sbsRow]
            offset += 1
    

In [12]:
r=9

shareCalculation(r, 4, [4], mini_naio.iloc[4,0],0)
r=r+15

for i in range(5,24):
     shareCalculation(r + (i-5) * 15, i, range(5,24), mini_naio.iloc[i,0],1)
r=r+15*(24-5)

shareCalculation(r,24, [24],mini_naio.iloc[24,0],2)
r=r+15

for i in range(25,27):
     shareCalculation(r + (i-25) * 15, i, range(25,27), mini_naio.iloc[i,0],3)
r=r+15*(27-25)

shareCalculation(r,27, [27], mini_naio.iloc[27,0],4)
r=r+15

for i in range(28,31):
     shareCalculation(r + (i-28) * 15, i,range(28,31),mini_naio.iloc[i,0],5)
r=r+15*(31-28)

for i in range(31,36):
     shareCalculation(r + (i-31) * 15, i, range(31,36),mini_naio.iloc[i,0],6)
r=r+15*(36-31)

shareCalculation(r,36, [36], mini_naio.iloc[36,0],7)
r=r+15

for i in range(37,41):
     shareCalculation(r + (i-37) * 15, i, range(37,41),mini_naio.iloc[i,0],8)
r=r+15*(41-37)

for i in range(41,44):
     shareCalculation(r + (i-41) * 15, i, range(41,44),mini_naio.iloc[i,0],9)
r=r+15*(44-41)

for i in range(44,46):
     shareCalculation(r + (i-44) * 15, i, range(44,46),mini_naio.iloc[i,0],10)
r=r+15*(46-44)

for i in range(46,51):
     shareCalculation(r + (i-46) * 15, i, range(46,51),mini_naio.iloc[i,0],11)
r=r+15*(51-46)

for i in range(51,56):
     shareCalculation(r + (i-51) * 15, i, range(51,56),mini_naio.iloc[i,0],12)
r=r+15*(56-51)

shareCalculation(r,56, [56], mini_naio.iloc[56,0],13)
r=r+15

for i in range(57,59):
     shareCalculation(r + (i-57) * 15, i, range(57,59),mini_naio.iloc[i,0],14)
r=r+15*(59-57)

# the loop covers range(59,66), but the weights use range(59,65): the last row has a NaN
# Compensation of employees, which would make the group total, and hence the weight of
# every sector of the group, NaN
for i in range(59,66):
     shareCalculation(r + (i-59) * 15, i, range(59,65),mini_naio.iloc[i,0],15)
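
Each call to `shareCalculation` writes 15 consecutive rows of `ff` (3 types of good times 5 dimensional classes), which is why the running index `r` advances by 15 per subsector. A minimal sketch of this bookkeeping; `block_starts` is a hypothetical helper, not part of the notebook:

```python
ROWS_PER_SUBSECTOR = 3 * 5  # 3 types of good (C, I, Int) x 5 dimensional classes


def block_starts(first_row, subsector_rows):
    """Map each mini_naio row to the first ff row of its 15-row block."""
    return {
        row: first_row + k * ROWS_PER_SUBSECTOR
        for k, row in enumerate(subsector_rows)
    }


# agriculture, silviculture and fishing occupy ff rows 0-8, so the first block starts at 9;
# mini_naio rows 4..65 are then processed one after the other
starts = block_starts(9, range(4, 66))
print(starts[4], starts[5], starts[24])
```

The values for rows 4, 5 and 24 correspond to `r=9`, `r=r+15` and `r=r+15*(24-5)` in the cell above.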
        

In [13]:
pd.set_option('display.float_format', '{:.10f}'.format)
pd.set_option("display.max_rows", None)
#pd.set_option("display.max_rows", 100)

In [14]:
# ff output 
ff.to_csv("ff_base.csv",index=False)

In [15]:
ff['L min'] = ff['dimensional class'].apply(
    lambda x: '250' if x.startswith('>') else x.split('-')[0])
ff.loc[ff['L min'] == '0', 'L min'] = '1'

ff['L max'] = ff['dimensional class'].apply(
    lambda x: '2000' if x.startswith('>') else x.split('-')[1]) # 2000 is a temporary cap (a better-reasoned one is still to be found)


# agriculture must add up to 19 mln workers in 10 mln firms (out of about 20 mln firms, Eurostat);
# this assumption, based on Istat and Eurostat data, justifies L max = 3 for agricultural firms,
# while for the other sectors we use the central value of each dimensional class
ff.loc[:8, "L max"] = '3'

ff.to_csv("ff_with_class_limits.csv",index=False)


ff['Firms in absolute numbers'] = ff['Share of firms'] * numberOfFirms 
# European data in real world -> CAVEAT: != number of firms in the model
centreOfClass = pd.to_numeric(ff['L min']) + (pd.to_numeric(ff['L max']) - pd.to_numeric(ff['L min']))/2
ff['Workers per class in absolute numbers'] = centreOfClass * ff['Firms in absolute numbers']    

grouped_ff = ff.groupby('NACE definition')

def sum_without_ge250(group):
    filtered_group = group[group['dimensional class'] != '>=250']
    return filtered_group['Workers per class in absolute numbers'].sum()

total_workers_without_ge250 = grouped_ff.apply(sum_without_ge250)
ff['Total Workers without >= 250 class'] = ff['NACE definition'].map(total_workers_without_ge250)

def sum_only_ge250(group):
    filtered_group = group[group['dimensional class'] == '>=250']
    return filtered_group['Firms in absolute numbers'].sum() # number of firms in the >=250 class

total_firms_only_ge250 = grouped_ff.apply(sum_only_ge250)
ff['Firms only >= 250 class'] = ff['NACE definition'].map(total_firms_only_ge250)




condition = ff['Share of firms per sbs sector'] == 1
rows_to_update = range(13, len(ff), 5) # 13 is the first row with dimensional class ">=250"
mask = (ff['Firms only >= 250 class'] != 0) & condition & ff.index.isin(rows_to_update)

ff.loc[mask, "L max"] = (
    (ff["sbs reference row"].map(sbs["Total"]) - ff['Total Workers without >= 250 class'])
    * 2 / ff['Firms only >= 250 class'].replace(0, np.nan)  
    ) - pd.to_numeric(ff['L min'])



condition2 = ff['Share of firms per sbs sector'] != 1
mask2 = (ff['Firms only >= 250 class'] != 0) & condition2 & ff.index.isin(rows_to_update)

#ff['Share of firms per sbs sector'] * sbs["Total"]
#ff['Share of firms per sbs sector'] * ff['Total Workers without >= 250 class']
"""
ff.loc[mask2, "L max"] = (
    ((ff["sbs reference row"].map(sbs["Total"]) * ff['Share of firms per sbs sector'])
     - (ff['Share of firms per sbs sector'] * ff['Total Workers without >= 250 class'])
    * 2 / ff['Firms only >= 250 class'].replace(0, np.nan)  
    ) - pd.to_numeric(ff['L min'])  
"""

ff.loc[mask2, "L max"] = (
    (
        (ff["sbs reference row"].map(sbs["Total"]) * ff['Share of firms per sbs sector']) 
        - (ff['Share of firms per sbs sector'] * ff['Total Workers without >= 250 class'])
    ) * 2 / ff['Firms only >= 250 class'].replace(0, np.nan)
) - pd.to_numeric(ff['L min'])


        
ff.loc[pd.isna(ff['Workers per class in absolute numbers']), 'L max'] = np.nan

/var/folders/wk/ylph8q1x10q_dvsr9mzx4r1r0000gn/T/ipykernel_26459/991266303.py:27: DeprecationWarning: 
DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future 
version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` 
to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
  total_workers_without_ge250 = grouped_ff.apply(sum_without_ge250)
/var/folders/wk/ylph8q1x10q_dvsr9mzx4r1r0000gn/T/ipykernel_26459/991266303.py:34: DeprecationWarning: 
DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future 
version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` 
to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
  total_firms_only_ge250 = grouped_ff.apply(sum_only_ge250)
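
The DeprecationWarning is raised because `apply` receives the grouping column together with the data. One possible way to obtain the same per-group totals without `apply` is to zero out the rows that must not count and then sum a single named column per group. A minimal sketch on a toy frame; the data are invented for illustration and this is an alternative formulation, not the notebook's own code:

```python
import pandas as pd

# toy frame with the same column names as ff (values are invented)
toy = pd.DataFrame({
    "NACE definition": ["Agriculture", "Mining", "Mining", "Mining"],
    "dimensional class": ["0-9", "0-9", "50-249", ">=250"],
    "Firms in absolute numbers": [100.0, 60.0, 4.0, 2.0],
    "Workers per class in absolute numbers": [200.0, 300.0, 598.0, 2250.0],
})

is_ge250 = toy["dimensional class"] == ">=250"

# workers of all classes except >=250, summed per NACE definition
workers_without_ge250 = (
    toy["Workers per class in absolute numbers"].where(~is_ge250, 0.0)
    .groupby(toy["NACE definition"]).sum()
)

# firms of the >=250 class only; groups without such rows get 0 rather than NaN
firms_only_ge250 = (
    toy["Firms in absolute numbers"].where(is_ge250, 0.0)
    .groupby(toy["NACE definition"]).sum()
)

toy["Total Workers without >= 250 class"] = toy["NACE definition"].map(workers_without_ge250)
toy["Firms only >= 250 class"] = toy["NACE definition"].map(firms_only_ge250)
print(toy)
```

Using `where(..., 0.0)` instead of dropping rows keeps a zero for groups such as agriculture that have no ">=250" rows, matching the 0.0 values shown in the output table.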

In [16]:
ff.to_csv("ff_correction_upper_limit.csv",index=False)
ff

Unnamed: 0,#,NACE definition,type,dimensional class,Share of firms,Share of firms per sbs sector,sbs reference row,L min,L max,Firms in absolute numbers,Workers per class in absolute numbers,Total Workers without >= 250 class,Firms only >= 250 class
0,1,Agriculture,C,0-9,0.1047371609,0.8106294167,,1,3.0,2173408.261909958,4346816.523819916,16212588.334316172,0.0
1,1,Agriculture,I,0-9,0.0054672101,0.8106294167,,1,3.0,113450.4645787184,226900.9291574367,16212588.334316172,0.0
2,1,Agriculture,Int,0-9,0.2804402453,0.8106294167,,1,3.0,5819435.44066941,11638870.88133882,16212588.334316172,0.0
3,2,Silviculture,C,0-9,0.010509848,0.1472252271,,1,3.0,218090.6030778887,436181.2061557773,2944504.541310155,0.0
4,2,Silviculture,I,0-9,0.0006367267,0.1472252271,,1,3.0,13212.760890288,26425.521780576,2944504.541310155,0.0
5,2,Silviculture,Int,0-9,0.05980168,0.1472252271,,1,3.0,1240948.9066869006,2481897.8133738013,2944504.541310155,0.0
6,3,Fishing,C,0-9,0.0100038502,0.0421453562,,1,3.0,207590.6060168938,415181.2120337876,842907.1243736735,0.0
7,3,Fishing,I,0-9,5.24446e-05,0.0421453562,,1,3.0,1088.2822064855,2176.5644129711,842907.1243736735,0.0
8,3,Fishing,Int,0-9,0.0102536719,0.0421453562,,1,3.0,212774.6739634574,425549.3479269148,842907.1243736735,0.0
9,4,Mining and quarrying,C,0-9,1.02362e-05,1.0,0.0,1,9.0,212.4118560392,1062.059280196,155270.4114285714,213.0
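
For the ">=250" class, the correction of "L max" follows from the class-centre relation used above: workers = firms * (L min + L max) / 2, hence L max = 2 * workers / firms - L min, where the workers of the class are the sbs total minus the workers of the other classes. A minimal sketch for the case in which the sector has a single subsector (share per sbs sector equal to 1), using the Mining and quarrying figures from the tables; `l_max_for_top_class` is a hypothetical name:

```python
def l_max_for_top_class(total_workers, workers_other_classes, firms_top_class, l_min=250):
    """Solve workers = firms * (l_min + l_max) / 2 for l_max."""
    workers_top_class = total_workers - workers_other_classes
    return 2 * workers_top_class / firms_top_class - l_min


# Mining and quarrying: sbs 'Total' and the last two columns of row 9 above
total_workers = 371000
workers_other_classes = 155270.4114285714
firms_top_class = 213.0

l_max = l_max_for_top_class(total_workers, workers_other_classes, firms_top_class)

# the class centre implied by l_max gives back the sbs total of the sector
centre = 250 + (l_max - 250) / 2
reconstructed_total = centre * firms_top_class + workers_other_classes
print(round(l_max, 2), round(reconstructed_total))
```

By construction, the corrected upper limit makes the workers implied by the class centres add up to the sbs total of the sector.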
