# NAICS
North American Industry Classification System

David Tersegno

4/15/22


A brief look at [this code system](https://www2.census.gov/programs-surveys/cbp/technical-documentation/reference/naics-descriptions/naics2017.txt) to find the industries relevant to our project. Later we'll count how many warehousing, storage, logistics, and shipping businesses appear in the associated data.
The file had to be re-saved as UTF-8. It originally came in ANSI encoding, which Jupyter doesn't handle well.
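
The re-save can also be done in Python rather than in a text editor. This is a minimal sketch, assuming "ANSI" here means Windows-1252 (`cp1252`), which is the usual meaning on Windows. The function name `resave_as_utf8` is hypothetical and not part of this notebook.

```python
from pathlib import Path


def resave_as_utf8(src, dst, src_encoding="cp1252"):
    """Re-encode a text file to UTF-8.

    src_encoding defaults to cp1252, which is an assumption about
    what "ANSI" means for this particular file.
    """
    text = Path(src).read_text(encoding=src_encoding)
    Path(dst).write_text(text, encoding="utf-8")
    return dst
```

Alternatively, `pd.read_csv` accepts an `encoding` argument, so passing the source encoding there may avoid the re-save entirely.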

The data we hope to apply this to:
[Data here](https://www.census.gov/programs-surveys/cbp/data/datasets.html),
[FTP server for entire census data archive here](https://www2.census.gov/)

This notebook confirms the NAICS codes for warehouse business types.



## Import libraries and data

In [1]:
import pandas as pd
import numpy as np

In [2]:
code_file_path = '../raw_data/naics2017_UTF8.txt'
naics = pd.read_csv(code_file_path)

In [3]:
naics

Unnamed: 0,NAICS,DESCRIPTION
0,------,Total for all sectors
1,11----,"Agriculture, Forestry, Fishing and Hunting"
2,113///,Forestry and Logging
3,1131//,Timber Tract Operations
4,11311/,Timber Tract Operations
...,...,...
1998,81394/,Political Organizations
1999,813940,Political Organizations
2000,81399/,"Other Similar Organizations (except Business, ..."
2001,813990,"Other Similar Organizations (except Business, ..."


[This report](https://www.epipeline.com/mktng/nl-articles/general-warehousing-and-storage-2015.html) has a short list of relevant codes. It focuses on 493110, General Warehousing and Storage, and also lists these cross references:

> Cross References: Renting or leasing space for self storage--are classified in Industry 531130, Lessors of Miniwarehouses and Self-Storage Units; and

> Selling in combination with handling and/or distributing goods to other wholesale or retail establishments--are classified in Sector 42, Wholesale Trade.

In [4]:
naics[naics['NAICS'] == '493110']

Unnamed: 0,NAICS,DESCRIPTION
1282,493110,General Warehousing and Storage


In [5]:
#cool. keep track of its index 
naics_list =[naics[naics['NAICS'] == '493110'].index[0]]
naics_list

[1282]

In [6]:
naics[naics['NAICS'] == '531130']

Unnamed: 0,NAICS,DESCRIPTION
1451,531130,Lessors of Miniwarehouses and Self-Storage Units


In [7]:
# Not sure if this is for us, but for now, keep track of it
naics_list.append(naics[naics['NAICS'] == '531130'].index[0])
naics_list

[1282, 1451]
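
Looking up each code by hand works for two codes. With more codes, a single `isin` filter can collect the indices in one pass. This is a minimal sketch on a small stand-in DataFrame: the rows and index values are copied from the outputs in this notebook, while `naics_sample`, `codes_of_interest`, and `matching_indices` are hypothetical names.

```python
import pandas as pd

# Stand-in for the full `naics` table; index values match the outputs shown.
naics_sample = pd.DataFrame(
    {
        "NAICS": ["493110", "531130", "813940"],
        "DESCRIPTION": [
            "General Warehousing and Storage",
            "Lessors of Miniwarehouses and Self-Storage Units",
            "Political Organizations",
        ],
    },
    index=[1282, 1451, 1999],
)

# Codes taken from the report's cross references.
codes_of_interest = ["493110", "531130"]

# Keep the rows whose code is in the list, then read off their index labels.
matching_indices = naics_sample[
    naics_sample["NAICS"].isin(codes_of_interest)
].index.tolist()
```

On the full table, the same filter applied to `naics` should give the indices gathered by hand above.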

In [8]:
#make a copy of the original, because I'm gonna start cutting through this for anything else of relevance.
naics_orig = naics.copy()

In [9]:
# Drops the first `number_to_cut` rows of the dataframe IN PLACE,
# then returns the first `number_to_cut` rows of what remains.
def chopper(dataframe, number_to_cut=20):
    this_index_list = dataframe.index
    chop_these_indices = this_index_list[:number_to_cut]
    dataframe.drop(chop_these_indices, inplace=True)
    return dataframe.head(number_to_cut)
    

In [25]:
# Vectorized string test; equivalent to a row-wise apply with str.startswith.
naics_orig['starts_with_49'] = naics_orig['NAICS'].str.startswith('49')

In [26]:
naics_orig['starts_with_49'].sum()

17

In [27]:
naics_orig[naics_orig['starts_with_49']]

Unnamed: 0,NAICS,DESCRIPTION,starts_with_49
1272,492///,Couriers and Messengers,True
1273,4921//,Couriers and Express Delivery Services,True
1274,49211/,Couriers and Express Delivery Services,True
1275,492110,Couriers and Express Delivery Services,True
1276,4922//,Local Messengers and Local Delivery,True
1277,49221/,Local Messengers and Local Delivery,True
1278,492210,Local Messengers and Local Delivery,True
1279,493///,Warehousing and Storage,True
1280,4931//,Warehousing and Storage,True
1281,49311/,General Warehousing and Storage,True


For now we can focus on warehouses.

Our codes fall under `493///`, Warehousing and Storage.
`4931//` is its only subdivision (as in taxonomy, where _sapiens_ is the only living species in _Homo_). It splits into four industries:

- `49311/`, `493110` General
- `49312/`, `493120` Refrigerated
- `49313/`, `493130` Farm Products
- `49319/`, `493190` Other
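
The listing suggests that every code is padded to six characters, with `-` padding the two-digit sectors and `/` padding the finer levels, so the digits that remain after stripping the padding act as a prefix for everything beneath that code. This is a minimal sketch under that assumption. The helper names `naics_prefix` and `is_under` are hypothetical and not part of this notebook.

```python
def naics_prefix(code):
    """Strip the '-' and '/' padding, leaving only the digits of the code."""
    return code.rstrip("-/")


def is_under(code, parent):
    """True if `code` is `parent` itself or sits below it in the hierarchy."""
    return naics_prefix(code).startswith(naics_prefix(parent))


# A few codes from the tables in this notebook.
codes = ["492///", "493///", "4931//", "49312/", "493120", "531130"]

# Everything at or below Warehousing and Storage.
warehousing = [c for c in codes if is_under(c, "493///")]
```

Here `warehousing` keeps the four `493...` codes and drops `492///` and `531130`.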

In [4]:
naics[naics['NAICS']=='493///']

Unnamed: 0,NAICS,DESCRIPTION
1279,493///,Warehousing and Storage


In [8]:
naics[naics['NAICS']=='81----']

Unnamed: 0,NAICS,DESCRIPTION
1908,81----,Other Services (except Public Administration)


In [9]:
naics[naics['NAICS']=='23----']

Unnamed: 0,NAICS,DESCRIPTION
109,23----,Construction


In [10]:
naics[naics['NAICS']=='44----']

Unnamed: 0,NAICS,DESCRIPTION
995,44----,Retail Trade


In [11]:
naics[naics['NAICS']=='238///']

Unnamed: 0,NAICS,DESCRIPTION
139,238///,Specialty Trade Contractors


## Next Step

The next notebook, [Summing warehouse businesses by zip code with ZBP data](./4_summarizingZipsandWarehouses.ipynb), uses the ZBP census data to find the total number of warehouse businesses registered in each ZIP code in California.