# Calling CN.listObjects

This script demonstrates interacting with the [CN.listObjects](https://purl.dataone.org/architecture/apis/CN_APIs.html#CNRead.listObjects) method using the Python client.

In [1]:
# Import the library and create a client instance

from d1_client import baseclient_2_0

cn_base_url = "https://cn.dataone.org/cn"
client = baseclient_2_0.DataONEBaseClient_2_0(cn_base_url)

Request only five results, starting from the first entry (index 0), then call the `listObjects` method.

In [2]:
num_to_retrieve = 5
starting_index = 0
params = {'count': num_to_retrieve,
          'start': starting_index
         }
response = client.listObjects( **params )

Show the response, printing out each entry.

In [3]:
from datetime import datetime as dt

# Timestamp layout used when printing modification dates
DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def printResults(response):
    print("Total objects: {0} Start: {1}  Page size: {2}\n".format(response.total, response.start, response.count))
    counter = response.start
    for entry in response.objectInfo:
        print(u"{:08d}: ".format(counter))
        print(u"            PID: {0}".format(entry.identifier.value()))
        print(u"       formatId: {0}".format(entry.formatId))
        print(u"           size: {0}".format(entry.size))
        print(u"  date_modified: {0}".format(entry.dateSysMetadataModified.strftime(DATE_FORMAT)))
        print("")
        counter += 1

printResults(response)

Total objects: 2889561 Start: 0  Page size: 5

00000000: 
            PID: 0000120ce277dbb2e140d74b50ca23e5
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 19541
  date_modified: 2018-04-20T07:59:45Z

00000001: 
            PID: 000026213216f47287f0d3027f3c4be3
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 26256
  date_modified: 2018-04-20T05:09:36Z

00000002: 
            PID: 0000aa6924377b6a7e5ab59bcec5d4f3
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 35084
  date_modified: 2018-02-17T03:01:16Z

00000003: 
            PID: 0000d11ff42b22915fcce5cfa6027040
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 35257
  date_modified: 2018-01-06T10:43:32Z

00000004: 
            PID: 0000eb4ff1fc59ae6c33a4981e00eabf
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 49904
  date_modified: 2018-01-08T11:18:27Z



Add a `fromDate` parameter, so `listObjects` will respond with the list of entries that were modified between one day ago and now.

In [4]:
import dateparser

start_date = dateparser.parse('yesterday UTC', 
                              settings={'RETURN_AS_TIMEZONE_AWARE': True})

params = {'count': num_to_retrieve,
          'start': starting_index,
          'fromDate': start_date,
         }
print( str(params) )
response = client.listObjects( **params )

printResults( response )


{'count': 5, 'start': 0, 'fromDate': datetime.datetime(2020, 7, 28, 16, 37, 50, 53057, tzinfo=<StaticTzInfo 'UTC'>)}
Total objects: 137 Start: 0  Page size: 5

00000000: 
            PID: 10.24431_rw1k464_2020_7_28_174358
       formatId: http://www.isotc211.org/2005/gmd
           size: 26795
  date_modified: 2020-07-28T17:44:00Z

00000001: 
            PID: 1c52f167-4dcc-4dd5-aaf6-15f508611cbf
       formatId: text/csv
           size: 87443
  date_modified: 2020-07-28T17:43:11Z

00000002: 
            PID: 1fe7e489-fce8-43ab-b0c4-c8e17375e2d2
       formatId: text/csv
           size: 28489
  date_modified: 2020-07-28T17:43:12Z

00000003: 
            PID: 3e775ef8-a927-4a16-96fb-d3acbeffd9c9
       formatId: text/csv
           size: 41591
  date_modified: 2020-07-28T17:43:17Z

00000004: 
            PID: 4dc88670-ea8b-4dc2-b8a4-edaedcfc1b79
       formatId: text/csv
           size: 112282
  date_modified: 2020-07-28T17:43:18Z



The server limits the number of records returned in a single response. When requesting a large set of entries, compare the `total` reported in the response with the number of entries received so far to determine whether additional pages of results should be requested.
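
The paging logic can be tried without contacting a server. The sketch below is an illustration under stated assumptions rather than part of the DataONE client: `iter_object_info` and `fake_list_objects` are hypothetical names, and the fake response mimics only the `total`, `start`, `count`, and `objectInfo` attributes used in this notebook.

```python
from types import SimpleNamespace


def iter_object_info(list_objects, page_size=3, max_items=25, **filters):
    """Yield up to `max_items` entries, requesting one page at a time.

    `list_objects` is any callable that accepts `start` and `count` keyword
    arguments and returns an object with `total`, `count` and `objectInfo`.
    """
    start = 0
    yielded = 0
    while yielded < max_items:
        # Never ask for more entries than are still needed
        count = min(page_size, max_items - yielded)
        response = list_objects(start=start, count=count, **filters)
        if response.count == 0:
            break  # an empty page means there is nothing more to read
        for entry in response.objectInfo:
            yield entry
            yielded += 1
        start += response.count
        if start >= response.total:
            break  # every entry reported by the server has been seen


def fake_list_objects(start=0, count=10, **_ignored):
    """Hypothetical stand-in for a server holding ten integer 'entries'."""
    data = list(range(10))
    page = data[start:start + count]
    return SimpleNamespace(total=len(data), start=start,
                           count=len(page), objectInfo=page)


first_seven = list(iter_object_info(fake_list_objects, page_size=3, max_items=7))
everything = list(iter_object_info(fake_list_objects, page_size=4, max_items=25))
print(first_seven)  # [0, 1, 2, 3, 4, 5, 6]
print(everything)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Passing the callable in, instead of a client instance, keeps the sketch testable. With a real client, `client.listObjects` could presumably be supplied in place of `fake_list_objects`, with `fromDate` and `toDate` given as extra keyword arguments.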

In [5]:
start_date = dateparser.parse('two weeks ago UTC', 
                              settings={'RETURN_AS_TIMEZONE_AWARE': True})
end_date = dateparser.parse('one week ago UTC', 
                              settings={'RETURN_AS_TIMEZONE_AWARE': True})
max_to_retrieve = 25  # limit total number of entries to download

params = {'start': 0,  
          'count': 3,      #specify a small page size
          'fromDate': start_date,
          'toDate': end_date}
response = client.listObjects( **params )

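# Never try to retrieve more entries than the server reports
max_to_retrieve = min(max_to_retrieve, response.total)

printResults( response )

num_retrieved = response.count
# Stop early if a page comes back empty, to avoid looping forever
while num_retrieved < max_to_retrieve and response.count > 0:
    params['start'] += response.count
    # Shrink the final page so the overall limit is not exceeded
    params['count'] = min(params['count'], max_to_retrieve - num_retrieved)
    response = client.listObjects( **params )
    num_retrieved += response.count
    printResults( response )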

Total objects: 2254 Start: 0  Page size: 3

00000000: 
            PID: 10016bde-b180-4e4a-b0c2-a912084bff9c
       formatId: application/octet-stream
           size: 58198749
  date_modified: 2020-07-21T19:35:14Z

00000001: 
            PID: 10.24431_rw1k43t_20203421026
       formatId: http://www.isotc211.org/2005/gmd
           size: 50059
  date_modified: 2020-07-20T18:50:48Z

00000002: 
            PID: 10.24431_rw1k43t_2020_7_20_185014
       formatId: http://www.isotc211.org/2005/gmd
           size: 48471
  date_modified: 2020-07-20T18:50:16Z

Total objects: 2254 Start: 3  Page size: 3

00000003: 
            PID: 10.24431_rw1k45x_2020_7_20_202245
       formatId: http://www.isotc211.org/2005/gmd
           size: 41883
  date_modified: 2020-07-20T20:22:46Z

00000004: 
            PID: 1072e855-aba2-40a9-9a22-30974ec790e1
       formatId: application/octet-stream
           size: 98762383
  date_modified: 2020-07-21T18:50:43Z

00000005: 
            PID: 184ef9f6-5ae0-49fc-9f90