# Calling CN.listObjects

This script demonstrates interacting with the [CN.listObjects](https://purl.dataone.org/architecture/apis/CN_APIs.html#CNRead.listObjects) method using the Python client.

In [1]:
# Import the library and create a client instance

from d1_client import baseclient_2_0

cn_base_url = "https://cn.dataone.org/cn"
client = baseclient_2_0.DataONEBaseClient_2_0(cn_base_url)

Request only five (5) results, starting from the first entry, then call the `listObjects` method.

In [2]:
num_to_retrieve = 5
starting_index = 0
params = {'count': num_to_retrieve,
          'start': starting_index
         }
response = client.listObjects( **params )
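
For orientation, the keyword arguments passed to `listObjects` correspond to query parameters on an HTTP GET request. The sketch below builds such a URL with the standard library only. The `/v2/object` path is an assumption based on the REST layout described in the linked DataONE API documentation, and `build_list_objects_url` is a hypothetical helper, not part of `d1_client`.

```python
from urllib.parse import urlencode


def build_list_objects_url(base_url, **params):
    """Return a listObjects-style URL (hypothetical helper, not a d1_client API).

    Assumes the CN exposes listObjects at <base_url>/v2/object.
    """
    query = urlencode(params)
    return "{0}/v2/object?{1}".format(base_url.rstrip("/"), query)


url = build_list_objects_url("https://cn.dataone.org/cn", count=5, start=0)
print(url)
```

The client library takes care of this step, so the helper is only useful for seeing roughly what is sent over the wire.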

Show the response, printing out each entry.

In [3]:
from datetime import datetime as dt

# Timestamp format used when printing dateSysMetadataModified
DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def printResults(response):
    print("Total objects: {0} Start: {1}  Page size: {2}\n".format(response.total, response.start, response.count))
    counter = response.start
    for entry in response.objectInfo:
        print(u"{:08d}: ".format(counter))
        print(u"            PID: {0}".format(entry.identifier.value()))
        print(u"       formatId: {0}".format(entry.formatId))
        print(u"           size: {0}".format(entry.size))
        print(u"  date_modified: {0}".format(entry.dateSysMetadataModified.strftime(DATE_FORMAT)))
        print("")
        counter += 1

printResults(response)

Total objects: 2895254 Start: 0  Page size: 5

00000000: 
            PID: 0000120ce277dbb2e140d74b50ca23e5
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 19541
  date_modified: 2018-04-20T07:59:45Z

00000001: 
            PID: 000026213216f47287f0d3027f3c4be3
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 26256
  date_modified: 2018-04-20T05:09:36Z

00000002: 
            PID: 0000aa6924377b6a7e5ab59bcec5d4f3
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 35084
  date_modified: 2018-02-17T03:01:16Z

00000003: 
            PID: 0000d11ff42b22915fcce5cfa6027040
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 35257
  date_modified: 2018-01-06T10:43:32Z

00000004: 
            PID: 0000eb4ff1fc59ae6c33a4981e00eabf
       formatId: http://www.isotc211.org/2005/gmd-pangaea
           size: 49904
  date_modified: 2018-01-08T11:18:27Z
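
`printResults` formats timestamps with a literal `Z` suffix, which is only accurate when the datetime is already expressed in UTC. A minimal sketch, assuming a timezone-aware datetime as input, of normalizing to UTC before formatting. `format_utc` is a hypothetical helper name and is not part of the client library.

```python
from datetime import datetime, timedelta, timezone

DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def format_utc(value):
    """Convert an aware datetime to UTC, then format it with a 'Z' suffix."""
    return value.astimezone(timezone.utc).strftime(DATE_FORMAT)


# 09:59:45 at UTC+02:00 is the same instant as 07:59:45 UTC
local = datetime(2018, 4, 20, 9, 59, 45, tzinfo=timezone(timedelta(hours=2)))
print(format_utc(local))
```

Whether this extra conversion is needed depends on how the response timestamps are represented, which this notebook does not show.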



Add a `fromDate` parameter so that `listObjects` returns only the entries that were modified between one day ago and now.

In [4]:
import dateparser

start_date = dateparser.parse('yesterday UTC', 
                              settings={'RETURN_AS_TIMEZONE_AWARE': True})

params = {'count': num_to_retrieve,
          'start': starting_index,
          'fromDate': start_date,
         }
print( str(params) )
response = client.listObjects( **params )

printResults( response )


{'count': 5, 'start': 0, 'fromDate': datetime.datetime(2020, 8, 5, 13, 8, 50, 339714, tzinfo=<StaticTzInfo 'UTC'>)}
Total objects: 186 Start: 0  Page size: 5

00000000: 
            PID: 05aaa4b0-fcd4-4c87-b0e1-84fb9d55ce21
       formatId: text/plain
           size: 1045
  date_modified: 2020-08-05T20:17:23Z

00000001: 
            PID: 09b8c6b5-d95d-46ee-9b27-4e884dd2222b
       formatId: http://www.openarchives.org/ore/terms
           size: 5440
  date_modified: 2020-08-05T20:17:52Z

00000002: 
            PID: 10.24431_rw1k46a_2020_8_5_20657
       formatId: http://www.isotc211.org/2005/gmd
           size: 52320
  date_modified: 2020-08-05T20:06:58Z

00000003: 
            PID: 10.24431_rw1k46b_2020_8_5_201725
       formatId: http://www.isotc211.org/2005/gmd
           size: 36733
  date_modified: 2020-08-05T20:17:25Z

00000004: 
            PID: 3d2f9ab9-a159-482c-a797-3f5e603d41df
       formatId: application/zip
           size: 55294222
  date_modified: 2020-08-05T20:17:18Z
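
The `fromDate` value printed above is a timezone-aware `datetime`. As a sketch of an alternative that avoids the `dateparser` dependency, a comparable value can be built with the standard library. `utc_days_ago` is a hypothetical helper, and the `params` dict here is only constructed, not sent to the server.

```python
from datetime import datetime, timedelta, timezone


def utc_days_ago(days, now=None):
    """Return a timezone-aware UTC datetime `days` days before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(days=days)


start_date = utc_days_ago(1)
params = {'count': 5, 'start': 0, 'fromDate': start_date}
print(params)
```

The optional `now` argument exists so the helper can be checked against a fixed reference time.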

The server limits the number of records returned in a single response. When requesting large sets of entries, compare the `count` and `total` values of each response to determine whether additional pages of results should be requested.
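
The paging logic can be sketched independently of the network. The generator below assumes a hypothetical `fetch_page(start, count)` callable that returns an object with `total`, `count`, and `objectInfo` attributes, mirroring the fields read from the `listObjects` response in this notebook. `FakePage` and `fake_fetch` are stand-ins for illustration only.

```python
from collections import namedtuple

# Stand-in for a listObjects response (illustration only)
FakePage = namedtuple("FakePage", ["total", "start", "count", "objectInfo"])


def iter_entries(fetch_page, page_size, max_entries):
    """Yield up to max_entries entries, requesting one page at a time."""
    start = 0
    yielded = 0
    while yielded < max_entries:
        page = fetch_page(start, page_size)
        if page.count == 0:
            break  # empty page: nothing more to read
        for entry in page.objectInfo:
            if yielded >= max_entries:
                return
            yield entry
            yielded += 1
        start += page.count
        if start >= page.total:
            break  # all available entries have been seen


def fake_fetch(start, count):
    """Serve slices of a 10-item list as if they were server pages."""
    data = list(range(10))
    chunk = data[start:start + count]
    return FakePage(total=len(data), start=start, count=len(chunk), objectInfo=chunk)


print(list(iter_entries(fake_fetch, page_size=3, max_entries=7)))
```

With the real client, `fetch_page` could be a small wrapper that forwards `start` and `count` to `client.listObjects` along with any date filters.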

In [5]:
start_date = dateparser.parse('two weeks ago UTC', 
                              settings={'RETURN_AS_TIMEZONE_AWARE': True})
end_date = dateparser.parse('one week ago UTC', 
                              settings={'RETURN_AS_TIMEZONE_AWARE': True})
max_to_retrieve = 25  # limit total number of entries to download

params = {'start': 0,  
          'count': 3,      # specify a small page size
          'fromDate': start_date,
          'toDate': end_date}
response = client.listObjects( **params )

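# Never ask for more entries than the server reports as available
if max_to_retrieve > response.total:
    max_to_retrieve = response.total

printResults( response )

num_retrieved = response.count
# Stop once enough entries are retrieved, or if the server returns an empty page
while num_retrieved < max_to_retrieve and response.count > 0:
    params['start'] += response.count
    response = client.listObjects( **params )
    num_retrieved += response.count
    printResults( response )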
    

Total objects: 1014 Start: 0  Page size: 3

00000000: 
            PID: 001d01a92d70ea24b4ab7e81c29858b9
       formatId: http://www.isotc211.org/2005/gmd
           size: 15674
  date_modified: 2020-07-29T19:51:44Z

00000001: 
            PID: 03df7e14aac0a44e89fd88a8e14ec794
       formatId: http://www.isotc211.org/2005/gmd
           size: 20263
  date_modified: 2020-07-29T19:50:11Z

00000002: 
            PID: 07b2d06230bef0b718d3dc3735878510
       formatId: http://www.isotc211.org/2005/gmd
           size: 19867
  date_modified: 2020-07-29T19:49:23Z

Total objects: 1014 Start: 3  Page size: 3

00000003: 
            PID: 07c91bc1-583c-4c20-9494-5764ebfe884b
       formatId: text/csv
           size: 45977422
  date_modified: 2020-07-27T22:12:05Z

00000004: 
            PID: 0b613da7-1fd7-42da-a8a1-6a723fc67c17
       formatId: text/csv
           size: 9209
  date_modified: 2020-07-27T17:51:26Z

00000005: 
            PID: 0c7974cb-58d0-4dff-9775-ec47d6f180bd
       formatId: tex