# XML Data Collection Code Template

## Purpose

This notebook extracts selected variables from trademark application XML files. As written, it collects the serial number of each case file together with the code and type of its first event statement. The XML format of these files is described at https://www.uspto.gov/sites/default/files/products/tmdailyapp-documentation.pdf.

## Instructions for Running This Notebook

1. Place all XML files to be processed in the same folder as this Jupyter Notebook.
2. List the file names as strings in the `files_to_parse` variable; an example is provided below.
3. To write the results to a CSV file, set the `write_to_csv` variable to `True` and provide the name of the output file in `csv_file_name`. Note that if you name an existing file, this program will overwrite its contents. The file will appear in the same folder as this Jupyter Notebook.
4. Run each code cell of this notebook in order. Alternatively, select `Cell -> Run All` from the top menu bar.
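
If the folder contains many XML files, typing each name by hand can be tedious. The following is a minimal sketch of one way to build the list automatically; the helper name `list_xml_files` is hypothetical and not part of the original notebook, and it assumes every `.xml` file in the folder should be processed.

```python
from pathlib import Path


def list_xml_files(folder="."):
    """Return the sorted names of all .xml files directly inside `folder`.

    Hypothetical helper: it only looks at the top level of the folder and
    returns bare file names, which matches the requirement that the XML
    files sit next to the notebook.
    """
    return sorted(p.name for p in Path(folder).glob("*.xml") if p.is_file())


# Possible usage in place of a hand-written list:
# files_to_parse = list_xml_files()
```

Because the helper returns bare file names, the result is only directly usable when `folder` is the notebook's own folder, which is the default.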

In [1]:
files_to_parse = ['apc200212.xml', ]  # Replace this line with all the XML file names in this format.

# Replace sample.csv with the name of the CSV file you'd like to create.
# csv_file_name doesn't matter if write_to_csv is False.
write_to_csv = False
csv_file_name = 'sample.csv'

## Code

In [2]:
from xml.etree import ElementTree as ET
import csv

if write_to_csv:
    file = open(csv_file_name, 'w', newline='')
    writer = csv.writer(file)
    writer.writerow(['Serial Number', 'Code', 'Type'])

for file_name in files_to_parse:
    source = ET.parse(file_name)

    trademark_applications_daily = source.getroot()

    # Walk every case file in every application-information block.
    for application_information in trademark_applications_daily.findall('application-information'):
        for file_segments in application_information.findall('file-segments'):
            for action_keys in file_segments.findall('action-keys'):

                for case_file in action_keys.findall('case-file'):
                    # findtext returns None instead of raising if the element is missing.
                    serial_num = case_file.findtext('serial-number')
                    # Use only the first event statement of the case file, if there is one.
                    event_statement = case_file.find('case-file-event-statements/case-file-event-statement')
                    # Compare with None explicitly: an Element without children is falsy.
                    if event_statement is not None:
                        code = event_statement.findtext('code')
                        type_var = event_statement.findtext('type')
                    else:
                        code, type_var = None, None
                    
                    if write_to_csv:
                        writer.writerow([serial_num, code, type_var])
                    else:
                        print(f"Serial Number {serial_num}, Code {code}, Type {type_var}")

if write_to_csv:
    file.close()

Serial Number 73253383, Code ARAA, Type I
Serial Number 73729747, Code NA89, Type E
Serial Number 74153456, Code NA89, Type E
Serial Number 74183674, Code ARAA, Type I
Serial Number 75416235, Code A7OK, Type O
Serial Number 76086342, Code NA89, Type O
Serial Number 76287591, Code CHAN, Type I
Serial Number 76331837, Code CHAN, Type I
Serial Number 76429972, Code NA89, Type E
Serial Number 77139171, Code TCCA, Type I
Serial Number 77871488, Code TCCA, Type I
Serial Number 77937845, Code GNRN, Type O
Serial Number 77937852, Code GNFN, Type O
Serial Number 77937863, Code GNFN, Type O
Serial Number 78943015, Code NA89, Type E
Serial Number 85056816, Code NA85, Type E
Serial Number 85079984, Code NA85, Type E
Serial Number 85080230, Code NA85, Type E
Serial Number 85408638, Code NAS8, Type E
Serial Number 85447511, Code NA85, Type E
Serial Number 85734056, Code NA85, Type E
Serial Number 85870924, Code APET, Type A
Serial Number 85902016, Code R.PR, Type A
Serial Number 85902022, Code R.PR,

Serial Number 77585697, Code RNL1, Type Q
Serial Number 77587171, Code PR89, Type O
Serial Number 77587681, Code E89R, Type I
Serial Number 77587729, Code E89R, Type I
Serial Number 77591825, Code PR89, Type O
Serial Number 77592907, Code PR89, Type O
Serial Number 77592909, Code PR89, Type O
Serial Number 77592914, Code PR89, Type O
Serial Number 77592973, Code E89R, Type I
Serial Number 77593021, Code E89R, Type I
Serial Number 77593874, Code TCCA, Type I
Serial Number 77595248, Code NA89, Type E
Serial Number 77595329, Code E89R, Type I
Serial Number 77595599, Code TCCA, Type I
Serial Number 77595961, Code NA89, Type E
Serial Number 77595999, Code E89R, Type I
Serial Number 77596932, Code EROP, Type I
Serial Number 77597340, Code EROP, Type I
Serial Number 77597392, Code PR89, Type O
Serial Number 77598028, Code ABN4, Type O
Serial Number 77598989, Code RNL1, Type Q
Serial Number 77601176, Code E89R, Type I
Serial Number 77601340, Code C8.T, Type O
Serial Number 77603130, Code RNL1,

Serial Number 79251244, Code OPNS, Type P
Serial Number 79251292, Code FICS, Type P
Serial Number 79251304, Code CHPB, Type I
Serial Number 79251363, Code CNSA, Type O
Serial Number 79251457, Code XXSS, Type O
Serial Number 79251473, Code FICR, Type P
Serial Number 79251487, Code OPNS, Type P
Serial Number 79251505, Code CNSA, Type O
Serial Number 79251518, Code FICS, Type P
Serial Number 79251557, Code XXSS, Type O
Serial Number 79251606, Code FIMP, Type P
Serial Number 79251667, Code FICR, Type P
Serial Number 79251675, Code TCCA, Type I
Serial Number 79251689, Code FICS, Type P
Serial Number 79251697, Code FIMP, Type P
Serial Number 79251710, Code FICR, Type P
Serial Number 79251736, Code CNSA, Type O
Serial Number 79251775, Code XXSS, Type O
Serial Number 79251789, Code OPNS, Type P
Serial Number 79251800, Code MAB2, Type E
Serial Number 79251832, Code OPNS, Type P
Serial Number 79251884, Code FICR, Type P
Serial Number 79251887, Code OPNS, Type P
Serial Number 79251926, Code FICR,

Serial Number 85439593, Code TCCA, Type I
Serial Number 85440592, Code NA85, Type E
Serial Number 85442308, Code PR23, Type O
Serial Number 85444400, Code E815, Type I
Serial Number 85449611, Code TCCA, Type I
Serial Number 85451253, Code ARAA, Type I
Serial Number 85454084, Code TCCA, Type I
Serial Number 85454095, Code TCCA, Type I
Serial Number 85455871, Code ARAA, Type I
Serial Number 85459263, Code NA85, Type E
Serial Number 85459305, Code PR23, Type O
Serial Number 85460141, Code NA85, Type E
Serial Number 85462425, Code NA85, Type E
Serial Number 85465596, Code EROP, Type I
Serial Number 85467281, Code E815, Type I
Serial Number 85467920, Code ES8R, Type I
Serial Number 85467975, Code C15A, Type O
Serial Number 85470751, Code CNSA, Type O
Serial Number 85473114, Code ES8R, Type I
Serial Number 85473999, Code E815, Type I
Serial Number 85474067, Code TCCA, Type I
Serial Number 85474118, Code EROP, Type I
Serial Number 85474226, Code NAS8, Type E
Serial Number 85476255, Code E815,

Serial Number 87067178, Code CNPR, Type P
Serial Number 87067590, Code EX3G, Type S
Serial Number 87069442, Code EXRA, Type E
Serial Number 87069635, Code TEME, Type I
Serial Number 87070906, Code EEXT, Type I
Serial Number 87070959, Code EX4G, Type S
Serial Number 87070961, Code EX3G, Type S
Serial Number 87071429, Code NREV, Type O
Serial Number 87072230, Code ES7R, Type I
Serial Number 87072238, Code NONP, Type E
Serial Number 87072698, Code TCCA, Type I
Serial Number 87072783, Code CHAN, Type I
Serial Number 87072792, Code CHAN, Type I
Serial Number 87073888, Code EXRA, Type E
Serial Number 87075141, Code TROA, Type I
Serial Number 87075475, Code TCCA, Type I
Serial Number 87076622, Code EX5G, Type S
Serial Number 87077221, Code TCCA, Type I
Serial Number 87077724, Code EXRA, Type E
Serial Number 87078472, Code EISU, Type I
Serial Number 87078478, Code EISU, Type I
Serial Number 87078632, Code EISU, Type I
Serial Number 87078773, Code EXRA, Type E
Serial Number 87078799, Code PETC,

Serial Number 87843503, Code EXRA, Type E
Serial Number 87843662, Code GNS3, Type O
Serial Number 87843714, Code EX3G, Type S
Serial Number 87843960, Code AITU, Type A
Serial Number 87844150, Code SUNA, Type E
Serial Number 87844712, Code ALIE, Type A
Serial Number 87845111, Code EX2G, Type S
Serial Number 87846061, Code CNSA, Type P
Serial Number 87846102, Code EXRA, Type E
Serial Number 87846409, Code NONP, Type E
Serial Number 87846865, Code EX3G, Type S
Serial Number 87847147, Code GNRN, Type O
Serial Number 87847786, Code SUPC, Type I
Serial Number 87847825, Code NONP, Type E
Serial Number 87847840, Code NONP, Type E
Serial Number 87848029, Code EX2G, Type S
Serial Number 87848035, Code EX2G, Type S
Serial Number 87848042, Code EX2G, Type S
Serial Number 87848664, Code NONP, Type E
Serial Number 87848673, Code NONP, Type E
Serial Number 87848877, Code TEME, Type I
Serial Number 87849532, Code CNPR, Type P
Serial Number 87849637, Code CNSA, Type P
Serial Number 87849679, Code EPPA,

Serial Number 88070709, Code NONP, Type E
Serial Number 88070718, Code ERSI, Type I
Serial Number 88070734, Code EXRA, Type E
Serial Number 88070749, Code NONP, Type E
Serial Number 88070828, Code NOAM, Type E
Serial Number 88070898, Code NONP, Type E
Serial Number 88071376, Code NONP, Type E
Serial Number 88071748, Code NONP, Type E
Serial Number 88071829, Code NONP, Type E
Serial Number 88071939, Code ERFR, Type I
Serial Number 88072045, Code ETOF, Type T
Serial Number 88072060, Code EXRA, Type E
Serial Number 88072153, Code NONP, Type E
Serial Number 88072492, Code NONP, Type E
Serial Number 88072795, Code ERTD, Type I
Serial Number 88072802, Code TCCA, Type I
Serial Number 88072996, Code NONP, Type E
Serial Number 88073290, Code TROA, Type I
Serial Number 88073373, Code AITU, Type A
Serial Number 88073484, Code EXRA, Type E
Serial Number 88073894, Code EXRA, Type E
Serial Number 88073900, Code EXRA, Type E
Serial Number 88073910, Code EPPA, Type I
Serial Number 88073977, Code EISU,

Serial Number 88288135, Code SUNA, Type E
Serial Number 88288178, Code SUPC, Type I
Serial Number 88288198, Code SUNA, Type E
Serial Number 88288281, Code EISU, Type I
Serial Number 88288314, Code SUPC, Type I
Serial Number 88288360, Code SUPC, Type I
Serial Number 88288648, Code NONP, Type E
Serial Number 88288932, Code SUPC, Type I
Serial Number 88289072, Code SUNA, Type E
Serial Number 88289100, Code EX1G, Type S
Serial Number 88289233, Code ARAA, Type I
Serial Number 88289315, Code CNPR, Type P
Serial Number 88289397, Code EXRA, Type E
Serial Number 88289468, Code CNPR, Type P
Serial Number 88289519, Code NONP, Type E
Serial Number 88289759, Code CNPR, Type P
Serial Number 88290207, Code RCCK, Type S
Serial Number 88290252, Code SUPC, Type I
Serial Number 88290261, Code SUPC, Type I
Serial Number 88290362, Code GNRN, Type O
Serial Number 88290422, Code CNPR, Type P
Serial Number 88290425, Code NONP, Type E
Serial Number 88290426, Code SUNA, Type E
Serial Number 88290465, Code EXRA,

Serial Number 88406221, Code RCSC, Type S
Serial Number 88406249, Code NONP, Type E
Serial Number 88406334, Code TROA, Type I
Serial Number 88406348, Code NONP, Type E
Serial Number 88406366, Code TROA, Type I
Serial Number 88406543, Code NONP, Type E
Serial Number 88406577, Code NONP, Type E
Serial Number 88406590, Code NONP, Type E
Serial Number 88406681, Code NONP, Type E
Serial Number 88406719, Code CNSA, Type O
Serial Number 88406853, Code NONP, Type E
Serial Number 88406910, Code NONP, Type E
Serial Number 88406944, Code NONP, Type E
Serial Number 88406948, Code NONP, Type E
Serial Number 88406954, Code NONP, Type E
Serial Number 88406959, Code CNSA, Type P
Serial Number 88406980, Code EX1G, Type S
Serial Number 88406995, Code NONP, Type E
Serial Number 88407005, Code NONP, Type E
Serial Number 88407008, Code NONP, Type E
Serial Number 88407009, Code NONP, Type E
Serial Number 88407011, Code NONP, Type E
Serial Number 88407016, Code NONP, Type E
Serial Number 88407047, Code NONP,

Serial Number 88486964, Code NONP, Type E
Serial Number 88486993, Code CNSA, Type O
Serial Number 88487424, Code NONP, Type E
Serial Number 88487638, Code PBCR, Type Z
Serial Number 88487657, Code ALIE, Type A
Serial Number 88487761, Code TROA, Type I
Serial Number 88487773, Code AUPC, Type I
Serial Number 88487864, Code NONP, Type E
Serial Number 88487867, Code SUNA, Type E
Serial Number 88487902, Code ZZZX, Type Z
Serial Number 88488074, Code NONP, Type E
Serial Number 88488098, Code NONP, Type E
Serial Number 88488116, Code NONP, Type E
Serial Number 88488141, Code EISU, Type I
Serial Number 88488146, Code NONP, Type E
Serial Number 88488164, Code SUPC, Type I
Serial Number 88488165, Code TEME, Type I
Serial Number 88488170, Code TROA, Type I
Serial Number 88488190, Code EISU, Type I
Serial Number 88488230, Code EISU, Type I
Serial Number 88488393, Code SUNA, Type E
Serial Number 88488426, Code EAAU, Type I
Serial Number 88488460, Code NONP, Type E
Serial Number 88488474, Code NREV,

Serial Number 88609656, Code NONP, Type E
Serial Number 88609681, Code DPCC, Type D
Serial Number 88609779, Code TCCA, Type I
Serial Number 88609857, Code ETOF, Type T
Serial Number 88609924, Code NONP, Type E
Serial Number 88610178, Code NONP, Type E
Serial Number 88610184, Code NONP, Type E
Serial Number 88610243, Code CNSA, Type O
Serial Number 88610385, Code OP.I, Type T
Serial Number 88610389, Code NONP, Type E
Serial Number 88610480, Code NONP, Type E
Serial Number 88610484, Code ETOF, Type T
Serial Number 88610491, Code NONP, Type E
Serial Number 88610586, Code NONP, Type E
Serial Number 88610707, Code NONP, Type E
Serial Number 88610743, Code NONP, Type E
Serial Number 88610865, Code NONP, Type E
Serial Number 88610894, Code NONP, Type E
Serial Number 88610896, Code GNS3, Type O
Serial Number 88610987, Code TCCA, Type I
Serial Number 88610992, Code ETOF, Type T
Serial Number 88611103, Code CNSA, Type P
Serial Number 88611109, Code NONP, Type E
Serial Number 88611138, Code NONP,

Serial Number 88660769, Code ALIE, Type A
Serial Number 88660775, Code NONP, Type E
Serial Number 88660779, Code NONP, Type E
Serial Number 88660786, Code NONP, Type E
Serial Number 88660789, Code CNSA, Type O
Serial Number 88660791, Code NONP, Type E
Serial Number 88660794, Code NONP, Type E
Serial Number 88660799, Code NONP, Type E
Serial Number 88660801, Code NONP, Type E
Serial Number 88660805, Code ALIE, Type A
Serial Number 88660826, Code ALIE, Type A
Serial Number 88660830, Code NONP, Type E
Serial Number 88660834, Code ALIE, Type A
Serial Number 88660839, Code NONP, Type E
Serial Number 88660866, Code NONP, Type E
Serial Number 88660877, Code NONP, Type E
Serial Number 88660881, Code NONP, Type E
Serial Number 88660887, Code NONP, Type E
Serial Number 88660893, Code NONP, Type E
Serial Number 88660895, Code TROA, Type I
Serial Number 88660908, Code NONP, Type E
Serial Number 88660911, Code NONP, Type E
Serial Number 88660913, Code NONP, Type E
Serial Number 88660914, Code GNFN,

Serial Number 88672980, Code ALIE, Type A
Serial Number 88672991, Code ALIE, Type A
Serial Number 88672996, Code ALIE, Type A
Serial Number 88673000, Code ALIE, Type A
Serial Number 88673009, Code ALIE, Type A
Serial Number 88673013, Code CNSA, Type O
Serial Number 88673025, Code TEME, Type I
Serial Number 88673049, Code GNRN, Type O
Serial Number 88673067, Code GNRN, Type O
Serial Number 88673084, Code ALIE, Type A
Serial Number 88673085, Code CNSA, Type O
Serial Number 88673121, Code ALIE, Type A
Serial Number 88673125, Code ALIE, Type A
Serial Number 88673127, Code TEME, Type I
Serial Number 88673133, Code CNSA, Type O
Serial Number 88673144, Code CNSA, Type O
Serial Number 88673155, Code ALIE, Type A
Serial Number 88673157, Code CNSA, Type P
Serial Number 88673163, Code ALIE, Type A
Serial Number 88673175, Code GNRN, Type O
Serial Number 88673179, Code TEME, Type I
Serial Number 88673180, Code TROA, Type I
Serial Number 88673216, Code GNRN, Type O
Serial Number 88673223, Code GNRN,

Serial Number 88683651, Code GNRN, Type O
Serial Number 88683662, Code GNRN, Type O
Serial Number 88683676, Code GNRN, Type O
Serial Number 88683681, Code XAEC, Type I
Serial Number 88683684, Code GNRN, Type O
Serial Number 88683685, Code GNRN, Type O
Serial Number 88683688, Code GNRN, Type O
Serial Number 88683689, Code GNRN, Type O
Serial Number 88683693, Code GEAN, Type O
Serial Number 88683694, Code CNSA, Type O
Serial Number 88683697, Code CNSA, Type O
Serial Number 88683699, Code GNRN, Type O
Serial Number 88683700, Code GNRN, Type O
Serial Number 88683709, Code DOCK, Type D
Serial Number 88683710, Code CNSA, Type P
Serial Number 88683711, Code CNSA, Type P
Serial Number 88683719, Code CNSA, Type P
Serial Number 88683722, Code GNS3, Type O
Serial Number 88683727, Code CNSA, Type O
Serial Number 88683728, Code CNRT, Type F
Serial Number 88683733, Code GNRN, Type O
Serial Number 88683736, Code CNSA, Type O
Serial Number 88683737, Code CNSA, Type P
Serial Number 88683738, Code GNRN,

Serial Number 88716562, Code NONP, Type E
Serial Number 88716684, Code TCCA, Type I
Serial Number 88716694, Code NONP, Type E
Serial Number 88716716, Code ALIE, Type A
Serial Number 88716727, Code PARI, Type I
Serial Number 88716794, Code NONP, Type E
Serial Number 88716825, Code NONP, Type E
Serial Number 88716863, Code NONP, Type E
Serial Number 88716911, Code NONP, Type E
Serial Number 88716923, Code CNSA, Type P
Serial Number 88716942, Code DOCK, Type D
Serial Number 88716947, Code NONP, Type E
Serial Number 88716954, Code GNRN, Type O
Serial Number 88716961, Code CNSA, Type P
Serial Number 88716983, Code NONP, Type E
Serial Number 88717026, Code TEME, Type I
Serial Number 88717208, Code MDSC, Type E
Serial Number 88717421, Code ARAA, Type I
Serial Number 88717454, Code NONP, Type E
Serial Number 88717474, Code DOCK, Type D
Serial Number 88717476, Code DOCK, Type D
Serial Number 88717509, Code NONP, Type E
Serial Number 88717514, Code NONP, Type E
Serial Number 88717575, Code NONP,

Serial Number 88789062, Code NWOS, Type I
Serial Number 88789063, Code NWOS, Type I
Serial Number 88789064, Code MDSC, Type E
Serial Number 88789065, Code NWOS, Type I
Serial Number 88789066, Code NWOS, Type I
Serial Number 88789068, Code NWOS, Type I
Serial Number 88789069, Code NWOS, Type I
Serial Number 88789070, Code NWOS, Type I
Serial Number 88789071, Code NWOS, Type I
Serial Number 88789072, Code MDSC, Type E
Serial Number 88789073, Code MDSC, Type E
Serial Number 88789074, Code MDSC, Type E
Serial Number 88789075, Code NWOS, Type I
Serial Number 88789076, Code NWOS, Type I
Serial Number 88789077, Code NWOS, Type I
Serial Number 88789078, Code NWOS, Type I
Serial Number 88789079, Code NWOS, Type I
Serial Number 88789080, Code NWOS, Type I
Serial Number 88789081, Code NWOS, Type I
Serial Number 88789082, Code NWOS, Type I
Serial Number 88789083, Code NWOS, Type I
Serial Number 88789084, Code NWOS, Type I
Serial Number 88789085, Code NWOS, Type I
Serial Number 88789086, Code NWOS,
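
The code cell in this section loads each file fully into memory with `ET.parse`. If a file is too large for that to be comfortable, a streaming variant based on `ET.iterparse` may help. The sketch below is an assumption about how that could look, not part of the original notebook; the generator name `iter_case_files` is hypothetical. Unlike the nested loops above, it matches every `case-file` element regardless of where it sits in the tree.

```python
from xml.etree import ElementTree as ET


def iter_case_files(source):
    """Yield (serial_number, code, type) for each case-file element.

    `source` may be a file name or a binary file-like object. Only the
    first case-file-event-statement of each case file is read.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "case-file":
            continue
        serial_num = elem.findtext("serial-number")
        statement = elem.find("case-file-event-statements/case-file-event-statement")
        if statement is not None:
            code = statement.findtext("code")
            type_var = statement.findtext("type")
        else:
            code, type_var = None, None
        yield serial_num, code, type_var
        # Drop the children of the finished case file to reduce memory use.
        elem.clear()


# Possible usage:
# for serial_num, code, type_var in iter_case_files('apc200212.xml'):
#     print(f"Serial Number {serial_num}, Code {code}, Type {type_var}")
```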

## Notes for Future Development

The XML tree of these files is complex because it contains a large number of variables. Turning this notebook into a template that can extract any variable through simple input changes would require hard-coding many element paths. For our purposes, I decided that this was not worth the time for now, but it could be added as a future feature.
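
As a starting point for such a feature, the sketch below maps output column names to element paths relative to each `case-file` element, so that adding a variable means adding one dictionary entry. This is only one possible design under stated assumptions: the names `CASE_FILE_PATH`, `FIELDS`, and `extract_rows` are hypothetical, and the only paths shown are the ones the code above already uses.

```python
from xml.etree import ElementTree as ET

# Path from the root element to each case file, as used by the loops above.
CASE_FILE_PATH = "application-information/file-segments/action-keys/case-file"

# Hypothetical configuration: column name -> path relative to a case-file element.
FIELDS = {
    "Serial Number": "serial-number",
    "Code": "case-file-event-statements/case-file-event-statement/code",
    "Type": "case-file-event-statements/case-file-event-statement/type",
}


def extract_rows(root, fields=FIELDS):
    """Return one dict per case file, mapping column name to text (or None)."""
    return [
        # findtext returns the text of the first match, or None if nothing matches.
        {column: case_file.findtext(path) for column, path in fields.items()}
        for case_file in root.findall(CASE_FILE_PATH)
    ]


# Possible usage:
# rows = extract_rows(ET.parse('apc200212.xml').getroot())
```

Each path is resolved independently and returns its first match, so variables that repeat within a case file (such as event statements) would need extra handling to keep related values together.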