# Title to SOC Mapping
***
This sample demonstrates a simple method for taking a raw job title, matching it to Emsi's job title library, and then finding the SOC code that best fits it.

In [1]:
# import and create the connection
import json
import EmsiApiPy
conn = EmsiApiPy.UnitedStatesPostingsConnection()

In [2]:
# given a raw title ("Data Scientists" in this example), map it to Emsi's Job Title Library
querystring = {
    "title_version": "emsi",  # specify the Emsi job title library (more comprehensive)
    "limit": 3   # we're just going to limit to 3 results for the sake of this example
}

data = conn.get_taxonomies(
    facet = "title",
    q = "Data Scientists",  # this is the raw input
    querystring = querystring
)
print(json.dumps(data, indent=4))

[
    {
        "id": "ET3B93055220D592C8",
        "name": "Data Scientists",
        "properties": {
            "singular_name": "Data Scientist",
            "unique_postings": 121043
        },
        "score": 6205.621
    },
    {
        "id": "ET5F758027D5A9C1D1",
        "name": "Principal Data Scientists",
        "properties": {
            "singular_name": "Principal Data Scientist",
            "unique_postings": 8057
        },
        "score": 137.40694
    },
    {
        "id": "ETB15B6675998124CE",
        "name": "Lead Data Scientists",
        "properties": {
            "singular_name": "Lead Data Scientist",
            "unique_postings": 7254
        },
        "score": 135.97774
    }
]


The API results show that the first record is the best match, with a score far above the others (the first record should typically be the best match). We will use that title's ID to get the top SOC codes for Data Scientists.
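Selecting the best match can also be done a little more defensively than indexing the first element. The following is a minimal sketch, not part of the Emsi client: `best_title_match` is a hypothetical helper, and it assumes that a higher `score` means a better match and that an empty result list is possible for unrecognized titles. The sample records are copied from the output above.

```python
from typing import Any, Dict, List, Optional


def best_title_match(results: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the record with the highest score, or None if there are no results.

    Hypothetical helper: assumes each record carries a numeric "score" field,
    as in the taxonomy response shown above.
    """
    if not results:
        return None
    return max(results, key=lambda record: record.get("score", 0.0))


# Records copied from the API output above (properties omitted for brevity).
sample = [
    {"id": "ET3B93055220D592C8", "name": "Data Scientists", "score": 6205.621},
    {"id": "ET5F758027D5A9C1D1", "name": "Principal Data Scientists", "score": 137.40694},
    {"id": "ETB15B6675998124CE", "name": "Lead Data Scientists", "score": 135.97774},
]

best = best_title_match(sample)
print(best["id"])  # ET3B93055220D592C8
```

Returning `None` for an empty list lets the caller decide how to handle a title with no match, rather than failing with an `IndexError`.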

In [3]:
emsi_title_id = data[0]["id"]

# once again, specify that we want to use the Emsi title library
querystring = {"title_version": "emsi"}

payload = {
  "filter": {
    "when": "active",  # only look at the active postings for this example
    "title": [emsi_title_id]  # limit to job postings with this emsi title
  },
  "rank": {
    "by": "unique_postings",
    "limit": 5  # limit to the top 5 occupation results
  }
}

df = conn.post_rankings_df(
    facet = "soc5_name",  # we'll get the results based on the name, since that is more readable
    querystring = querystring,
    payload = payload
)
df.head()

| | soc5_name | unique_postings |
|---|---|---|
| 0 | Computer and Information Research Scientists | 4633 |
| 1 | Market Research Analysts and Marketing Special... | 305 |
| 2 | Medical Scientists, Except Epidemiologists | 236 |
| 3 | Management Analysts | 132 |
| 4 | Unclassified Occupation | 102 |


From these results, we can see that Data Scientists typically map to "Computer and Information Research Scientists" ([SOC 15-1221](https://www.bls.gov/oes/current/oes151221.htm)).
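To put a number on "typically", the ranking can be reduced to a single top occupation and its share of postings. This is a rough sketch in plain Python: `top_soc` is a hypothetical helper, the rows are copied from the ranking above (including the truncated name as printed), and the share is computed only over the five ranked occupations, not over all postings for the title. Skipping "Unclassified Occupation" is an assumption about what makes a useful mapping.

```python
from typing import List, Optional, Sequence, Tuple

# (soc5_name, unique_postings) rows copied from the ranking above.
rows = [
    ("Computer and Information Research Scientists", 4633),
    ("Market Research Analysts and Marketing Special...", 305),
    ("Medical Scientists, Except Epidemiologists", 236),
    ("Management Analysts", 132),
    ("Unclassified Occupation", 102),
]


def top_soc(
    ranking: List[Tuple[str, int]],
    skip: Sequence[str] = ("Unclassified Occupation",),
) -> Optional[Tuple[str, float]]:
    """Return (name, share) for the occupation with the most postings.

    Occupations listed in `skip` cannot be chosen, but their postings
    still count toward the total used for the share.
    """
    candidates = [(name, count) for name, count in ranking if name not in skip]
    total = sum(count for _, count in ranking)
    if not candidates or total == 0:
        return None
    name, count = max(candidates, key=lambda row: row[1])
    return name, count / total


name, share = top_soc(rows)
print(f"{name}: {share:.1%}")
# Computer and Information Research Scientists: 85.7%
```

A share this high among the ranked results suggests the mapping is fairly unambiguous for this title; a title whose postings are split more evenly across occupations may need manual review.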