# Brief walkthrough of the USPTO Open Data API Client

**Version**: Dec 10 2020


Reference: [USPTO Open Data API Client](https://docs.ip-tools.org/uspto-opendata-python/peds.html)

The focus here is on using the Patent Examination Data System, or PEDS. (Note: This API client also has PAIR Bulk Data [PBD] access, but PBD has been decommissioned.)

## Import the package and create a client

In [1]:
from uspto.peds.client import UsptoPatentExaminationDataSystemClient
client = UsptoPatentExaminationDataSystemClient()

## Let's see what options the client's `download_document` function has. 

In [2]:
help(client.download_document)

Help on method download_document in module uspto.util.client:

download_document(*args, **kwargs) method of uspto.peds.client.UsptoPatentExaminationDataSystemClient instance



## The `help()` function was not very helpful here. Let's try `help` on the client itself.

In [3]:
help(client)

Help on UsptoPatentExaminationDataSystemClient in module uspto.peds.client object:

class UsptoPatentExaminationDataSystemClient(uspto.util.client.UsptoGenericBulkDataClient)
 |  Python client for accessing the USPTO Patent Examination Data System API (https://ped.uspto.gov/peds/).
 |  See also: https://ped.uspto.gov/peds/#/apiDocumentation
 |  
 |  Method resolution order:
 |      UsptoPatentExaminationDataSystemClient
 |      uspto.util.client.UsptoGenericBulkDataClient
 |      builtins.object
 |  
 |  Data and other attributes defined here:
 |  
 |  DATASOURCE_NAME = 'peds'
 |  
 |  PACKAGE_DOWNLOAD_URL = 'https://ped.uspto.gov/api/queries/{query_id}/d...
 |  
 |  PACKAGE_REQUEST_URL = 'https://ped.uspto.gov/api/queries/{query_id}/pa...
 |  
 |  PACKAGE_STATUS_URL = 'https://ped.uspto.gov/api/queries/{query_id}?for...
 |  
 |  QUERY_URL = 'https://ped.uspto.gov/api/queries'
 |  
 |  document_factory = <class 'uspto.peds.document.UsptoPatentExaminationD...
 |  
 |  
 |  -------------

## The important methods seem to be inherited from the `UsptoGenericBulkDataClient` class. Let's check that one out.

In [4]:
import uspto.util.client  # the module that defines the base class
client2 = uspto.util.client.UsptoGenericBulkDataClient()
help(client2)

Help on UsptoGenericBulkDataClient in module uspto.util.client object:

class UsptoGenericBulkDataClient(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  check_package_url(self, query_id, format)
 |  
 |  download(self, query_id, format=None, progressbar=False)
 |  
 |  download_document(self, *args, **kwargs)
 |  
 |  download_package(self, query_id, format, progressbar=False)
 |  
 |  package_status(self, query_id, format)
 |  
 |  query(self, expression, filter=None, sort=None, start=None, rows=None, default_field=None)
 |  
 |  query_application(self, applicationId)
 |  
 |  query_patent(self, patentNumber)
 |  
 |  query_publication(self, appEarlyPubNumber)
 |  
 |  request_package(self, query_id, format)
 |  
 |  search(self, *args, **kwargs)
 |  
 |  unzip_package(self, payload_zip)
 |  
 |  wait_for_package(self, query_id, format)
 |  
 |  ------------------------------------------

## That also wasn't very helpful. I ended up reading the `download_document` source code directly.

For me, it was located in my conda environment at: 
`/home/vlim/local/miniconda3/envs/patents/lib/python3.6/site-packages/uspto/util/client.py`
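
The file location and the source text can also be retrieved from within Python using the standard `inspect` module, rather than searching the environment by hand. The following is a minimal sketch: `locate_source` is a hypothetical helper name, and it is demonstrated on `json.loads` so that it runs without the `uspto` package installed. In the notebook, one would presumably pass `client.download_document` instead.

```python
import inspect
import json


def locate_source(obj):
    """Return (path, source_text) for a Python function, method, or class.

    Hypothetical helper built only on the standard library.
    """
    path = inspect.getsourcefile(obj)
    source = inspect.getsource(obj)
    return path, source


# Demonstrated on a standard-library function; in the notebook one would
# pass client.download_document instead (assumption: it is pure Python).
path, source = locate_source(json.loads)
print(path)
print(source[:200])
```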

## Now let's get data for a specific U.S. patent application.

We'll look up application number 09/418,640. Remove all punctuation (here, the slash and the comma) before passing the number to the function.

In [5]:
# get application info for 09/418,640
result = client.download_document(type='application', number='09418640', format='json')
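
Stripping the punctuation by hand is error-prone when looking up many applications. A small sketch of how it could be automated; `normalize_application_number` is a hypothetical helper, not part of the `uspto` package.

```python
import re


def normalize_application_number(number):
    """Strip everything except digits, e.g. '09/418,640' -> '09418640'.

    Hypothetical helper, intended only for U.S. application numbers
    written as series/serial. It would mangle PCT numbers such as
    'PCT/US00/27963', which contain letters.
    """
    return re.sub(r"\D", "", number)


print(normalize_application_number("09/418,640"))  # prints 09418640
```

The returned string could then be passed as the `number` argument of `download_document`.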

## The resulting data is stored in a dictionary.

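This is true whether the requested format is JSON or XML. The key is the format name (`'json'` or `'xml'`) and its value is the raw response as a byte string.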

In [6]:
type(result)

dict

## Take a look at the data available in the result.

In [7]:
result

{'json': b'{ "PatentData" : [ {"patentCaseMetadata":{"applicationNumberText":{"value":"09418640","electronicText":"09418640"},"filingDate":"1999-10-15","applicationTypeCategory":"Utility","partyBag":{"applicantBagOrInventorBagOrOwnerBag":[{"primaryExaminerOrAssistantExaminerOrAuthorizedOfficer":[{"name":{"personNameOrOrganizationNameOrEntityName":[{"personFullName":"LACOURCIERE, KAREN A"}]}}]},{"inventorOrDeceasedInventor":[{"contactOrPublicationContact":[{"name":{"personNameOrOrganizationNameOrEntityName":[{"personStructuredName":{"firstName":"JENNIFER","middleName":"K.","lastName":"TAYLOR"}}]},"cityName":"SOLANA BEACH","geographicRegionName":{"value":"CA","geographicRegionCategory":"STATE"},"countryCode":"US"}]},{"contactOrPublicationContact":[{"name":{"personNameOrOrganizationNameOrEntityName":[{"personStructuredName":{"firstName":"LEX","middleName":"M.","lastName":"COWSERT"}}]},"cityName":"CARLSBAD","geographicRegionName":{"value":"CA","geographicRegionCategory":"STATE"},"countryCo

## For fun, let's also look at the XML data. It's less pretty than JSON.

In [8]:
# get the same application info for 09/418,640, this time as XML
result2 = client.download_document(type='application', number='09418640', format='xml')
result2

{'xml': b'<?xml version="1.0" encoding="UTF-8" ?>\n<uspat:PatentBulkData xsi:schemaLocation="urn:us:gov:doc:uspto:patent ../../main/resources/Schema/USPatent/Document/PatentData_V8_0.xsd" xmlns:pat="http://www.wipo.int/standards/XMLSchema/ST96/Patent" xmlns:uscom="urn:us:gov:doc:uspto:common" xmlns:uspat="urn:us:gov:doc:uspto:patent" xmlns:com="http://www.wipo.int/standards/XMLSchema/ST96/Common" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" com:st96Version="V3_1" com:ipoVersion="US_V8_0"><uspat:PatentData com:st96Version="V3_1" com:ipoVersion="US_V8_0" xsi:schemaLocation="urn:us:gov:doc:uspto:patent ../../main/resources/Schema/USPatent/Document/PatentBulkData_V8_0.xsd" xmlns:uspat="urn:us:gov:doc:uspto:patent" xmlns:tbl="http://www.oasis-open.org/tables/exchange/1.0" xmlns:pat="http://www.wipo.int/standards/XMLSchema/ST96/Patent" xmlns:uscom="urn:us:gov:doc:uspto:common" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:com="http://www.wipo.int/standards/XMLSchema/ST96/Common
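
The XML bytes can be parsed with the standard library's `xml.etree.ElementTree`. The sketch below uses a tiny stand-in payload so that it runs on its own; only the `uspat` namespace URI and the `PatentBulkData` and `PatentData` element names are taken from the output above, and everything else is illustrative. In the notebook, one would presumably pass `result2['xml']` to `ET.fromstring` instead.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for result2['xml']; the real payload is far larger.
sample_xml = (
    b'<?xml version="1.0" encoding="UTF-8" ?>\n'
    b'<uspat:PatentBulkData xmlns:uspat="urn:us:gov:doc:uspto:patent">'
    b'<uspat:PatentData/>'
    b'</uspat:PatentBulkData>'
)

# Map the prefix to its namespace URI so element lookups stay readable.
NS = {"uspat": "urn:us:gov:doc:uspto:patent"}

root = ET.fromstring(sample_xml)
records = root.findall("uspat:PatentData", NS)

print(root.tag)      # the tag includes the namespace URI in braces
print(len(records))  # number of PatentData records directly under the root
```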

## Let's pull out the continuity data from the JSON result.

In [9]:
type(result['json'])

bytes

## We need to convert the internal byte string into a dictionary before we can pull out data.

In [10]:
import json

byte_str = result['json']
mydata = json.loads(byte_str.decode('utf-8'))
type(mydata)

dict

In [11]:
mydata

{'PatentData': [{'patentCaseMetadata': {'applicationNumberText': {'value': '09418640',
     'electronicText': '09418640'},
    'filingDate': '1999-10-15',
    'applicationTypeCategory': 'Utility',
    'partyBag': {'applicantBagOrInventorBagOrOwnerBag': [{'primaryExaminerOrAssistantExaminerOrAuthorizedOfficer': [{'name': {'personNameOrOrganizationNameOrEntityName': [{'personFullName': 'LACOURCIERE, KAREN A'}]}}]},
      {'inventorOrDeceasedInventor': [{'contactOrPublicationContact': [{'name': {'personNameOrOrganizationNameOrEntityName': [{'personStructuredName': {'firstName': 'JENNIFER',
               'middleName': 'K.',
               'lastName': 'TAYLOR'}}]},
           'cityName': 'SOLANA BEACH',
           'geographicRegionName': {'value': 'CA',
            'geographicRegionCategory': 'STATE'},
           'countryCode': 'US'}]},
        {'contactOrPublicationContact': [{'name': {'personNameOrOrganizationNameOrEntityName': [{'personStructuredName': {'firstName': 'LEX',
             

## Now we can pull out the related-document (continuity) information.

In [12]:
mydata['PatentData'][0]['patentCaseMetadata']['relatedDocumentData']['parentDocumentDataOrChildDocumentData']

[{'descriptionText': 'which is - claims the benefit of 09418640',
  'applicationNumberText': 'PCT/US00/27963',
  'filingDate': '2000-10-11',
  'childDocumentStatusCode': '-',
  'patentNumber': ''}]
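
Chained indexing like the cell above raises a `KeyError` if any level is missing, which may happen for applications without continuity data (an assumption; this walkthrough only checks one application). A more defensive sketch, using a hypothetical helper `get_related_documents` and a stand-in dictionary trimmed from the output above:

```python
def get_related_documents(data):
    """Return the continuity entries of the first record, or [] if absent.

    Hypothetical helper; the key names are copied from the output above.
    """
    records = data.get("PatentData") or [{}]
    metadata = records[0].get("patentCaseMetadata", {})
    related = metadata.get("relatedDocumentData", {})
    return related.get("parentDocumentDataOrChildDocumentData", [])


# Stand-in for mydata, trimmed to the fields used here.
sample = {
    "PatentData": [{
        "patentCaseMetadata": {
            "relatedDocumentData": {
                "parentDocumentDataOrChildDocumentData": [{
                    "descriptionText": "which is - claims the benefit of 09418640",
                    "applicationNumberText": "PCT/US00/27963",
                    "filingDate": "2000-10-11",
                    "childDocumentStatusCode": "-",
                    "patentNumber": "",
                }]
            }
        }
    }]
}

for doc in get_related_documents(sample):
    print(doc["applicationNumberText"], doc["filingDate"])

# A record without continuity data yields an empty list instead of an error.
print(get_related_documents({"PatentData": [{"patentCaseMetadata": {}}]}))
```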

## If we compare this to the results on the [PEDS search page](https://ped.uspto.gov/peds/#/search) we see that it matches.

This application has no parent continuity data. It does have child continuity data: a PCT application (app. no. PCT/US00/27963) that claims the benefit of this one.