# USPTO Data Mining

With this notebook and the `uspto` package you can parse the raw trademark XML data provided by the USPTO.

### Loading packages 

In [None]:
import pandas as pd
import uspto as pto

### Open USPTO File

In [None]:
# Path to data
path = "/home/jlroo/marketing/data/apc161231-56.xml"
data = pto.openUSPTO(path)

### Get XML root

Getting the root might take a couple of minutes, depending on the size of the XML file and the RAM of your machine.

In [None]:
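# Reuse the tree opened in the previous cell instead of parsing the file again.
# Run the cells in order: later cells rebind the name `data`.
root = data.getroot()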
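
For orientation, the sketch below shows the same parse-then-`getroot()` pattern with Python's standard `xml.etree.ElementTree` on a tiny in-memory document. It assumes, without confirmation from the package source, that `pto.openUSPTO` returns an object with a comparable `getroot()` method. The element names and serial numbers are invented for illustration and are not taken from a real USPTO file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature document; real USPTO files are far larger
# and use their own element names.
sample_xml = b"""<?xml version="1.0"?>
<trademark-applications>
  <case-file><serial-number>00000001</serial-number></case-file>
  <case-file><serial-number>00000002</serial-number></case-file>
</trademark-applications>"""

# ET.parse reads the whole document into memory, which is why large
# files can take a while and need a matching amount of RAM.
tree = ET.parse(io.BytesIO(sample_xml))
sample_root = tree.getroot()  # top-level element of the parsed tree

# Walk the tree and collect the text of every <serial-number> element.
serials = [el.text for el in sample_root.iter("serial-number")]
print(sample_root.tag, serials)
```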

### File Description

The `pto.getDetails(root)` function extracts useful information about the XML file, including the volume of trademark applications it contains.

In [None]:
details = pto.getDetails(root)
pd.DataFrame.from_dict(details,orient='index')

# Extracting and Creating Tables

## Case File Header 

Extract the case file header data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
file_header = pto.getFileHeader(root)

In [None]:
table = pd.DataFrame.from_dict(file_header, orient='index')
table.head()
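
As a minimal sketch of what `orient='index'` does, the example below builds a table from a small dictionary of dictionaries. The shape (one inner dictionary per serial number) and the field names are assumptions made for illustration; the actual structure returned by `pto.getFileHeader` may differ.

```python
import pandas as pd

# Hypothetical records keyed by serial number; field names are illustrative only.
sample_header = {
    "00000001": {"filing_date": "20160101", "status_code": "630"},
    "00000002": {"filing_date": "20160215", "status_code": "700"},
}

# orient='index' turns each outer key into a row label
# and each inner key into a column.
sample_table = pd.DataFrame.from_dict(sample_header, orient="index")
sample_table.index.name = "serial_number"
print(sample_table)
```

Without `orient='index'` the outer keys would become columns instead of rows, which is rarely the layout you want for one-record-per-case data.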

In [None]:
table.to_csv("casefileHeader.csv")
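
One detail worth checking when reading an exported CSV back: `to_csv` writes the index as the first column, and `read_csv` infers numeric types by default, which would drop leading zeros from identifiers. The sketch below round-trips a small hypothetical table through an in-memory buffer; the column name `serial_number` and the values are assumptions, not fields confirmed by the package.

```python
import io
import pandas as pd

# Hypothetical table indexed by zero-padded serial numbers.
sample = pd.DataFrame.from_dict(
    {"00000001": {"status_code": "630"}, "00000002": {"status_code": "700"}},
    orient="index",
)

# Write to an in-memory buffer instead of a file; index_label names
# the index column in the CSV header.
buffer = io.StringIO()
sample.to_csv(buffer, index_label="serial_number")
buffer.seek(0)

# dtype=str keeps every value as text, so leading zeros survive.
restored = pd.read_csv(buffer, dtype=str).set_index("serial_number")
print(restored)
```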

## Case File Classification

Extract the case file classification data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
classifications = pto.getClassifications(root)

In [None]:
data = []
for k in classifications.keys():
    for d in classifications[k]:
        data.append(classifications[k][d])
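
The loop above flattens a two-level dictionary into a list of records, discarding the outer key along the way. If the inner records do not already carry their case identifier, a variant like the following keeps it as a column. This is a sketch on made-up data: the nesting (serial number, then entry index, then record) is inferred from the loop, and the names `serial_number` and `international_code` are hypothetical.

```python
import pandas as pd

# Hypothetical two-level result: serial number -> entry index -> record.
sample_nested = {
    "00000001": {0: {"international_code": "025"}, 1: {"international_code": "035"}},
    "00000002": {0: {"international_code": "009"}},
}

# Same flattening as the nested loop, but each row also records
# the outer key it came from.
rows = [
    {"serial_number": serial, **record}
    for serial, entries in sample_nested.items()
    for record in entries.values()
]

flat = pd.DataFrame(rows)
print(flat)
```

Note that if a record already contains a `serial_number` field, its value takes precedence over the outer key because `**record` is unpacked last.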

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("classifications.csv")

## Case File Classification Codes

Extract the case file classification codes from the XML file. This table can also be obtained from the classification table. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
classification_codes = pto.getClassificationCodes(root)

In [None]:
data = []
for k in classification_codes.keys():
    for d in classification_codes[k]:
        data.append(classification_codes[k][d])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("classification_codes.csv")

## Case File Design Search

Extract the case file design search data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
design = pto.getDesignSearch(root)

In [None]:
data = []
for k in design.keys():
    for d in design[k]:
        data.append(design[k][d])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("designSearch.csv")

## Case File Owners

Extract the case file owners data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
owners = pto.getFileOwners(root)

In [None]:
data = []
for k in owners.keys():
    for d in owners[k]:
        data.append(owners[k][d])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("fileOwners.csv")

## Case File Statements

Extract the case file statements data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
statements = pto.getFileStatements(root)

In [None]:
data = []
for k in statements.keys():
    for d in statements[k]:
        data.append(statements[k][d])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("fileStatements.csv")

## Case File Foreign Applications

Extract the case file foreign applications data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
foreign = pto.getForeignApplications(root)

In [None]:
data = []
for k in foreign.keys():
    for d in foreign[k]:
        data.append(foreign[k][d])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("foreignApplications.csv")

## Case File Prior Applications

Extract the case file prior applications data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
prior = pto.getPriorApplications(root)

In [None]:
data = []
for k in prior.keys():
    for d in prior[k]:
        data.append(prior[k][d])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("priorApplications.csv")

## Case File Events

Extract the case file events data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
events = pto.getFileEvent(root)

In [None]:
data = []
for k in events.keys():
    for d in events[k]:
        data.append(events[k][d])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("fileEvent.csv")

## Case File Correspondent

Extract the case file correspondent data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
correspondent = pto.getCorrespondent(root)

In [None]:
data = []
for k in correspondent.keys():
    data.append(correspondent[k])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("correspondent.csv")

## Case File Madrid Filing

Extract the case file Madrid filing data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
madrid_filing = pto.getMadridFiling(root)

In [None]:
data = []
for k in madrid_filing.keys():
    data.append(madrid_filing[k])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("madridFiling.csv")

## Case File Madrid Events

Extract the case file Madrid events data from the XML file. This function creates a dictionary that can be transformed into a table using pandas.

In [None]:
madrid_events = pto.getMadridEvents(root)

In [None]:
data = []
for k in madrid_events.keys():
    for d in madrid_events[k]:
        data.append(madrid_events[k][d])

In [None]:
table = pd.DataFrame(data)
table.head()

In [None]:
table.to_csv("madridEvents.csv")