## Process REST Payload using Collection Operations

Let us understand how to process a REST payload using collection operations.
* We can get details about public repositories using `GET /repositories` from **https://api.github.com**.
* Because we are only reading data from an external application, the request uses `GET`. The payload is a JSON array.
* We can convert this JSON array to a Python `list`. Each element in the list will be of type `dict`.
* Let us understand how the data in this list of dicts can be processed using core Python collection operations.
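
As a minimal offline sketch of the conversion described above, the snippet below parses a small hand-written JSON array with the standard `json` module. The sample string and its values are made up for illustration; they only mimic the shape of the GitHub payload.

```python
import json

# Hypothetical sample: a JSON array shaped like the GitHub payload.
sample_payload = '''
[
  {"id": 1, "name": "alpha", "owner": {"login": "octo", "type": "User"}},
  {"id": 2, "name": "beta", "owner": {"login": "acme", "type": "Organization"}}
]
'''

sample_repos = json.loads(sample_payload)  # JSON array -> Python list

print(type(sample_repos))     # the outer JSON array becomes a list
print(type(sample_repos[0]))  # each JSON object becomes a dict
print(sample_repos[1]['owner']['type'])  # nested objects become nested dicts
```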

* Let us quickly review the output of the REST API using `curl`.

In [None]:
!curl https://api.github.com/repositories

* We can get the payload of public repositories using `requests.get`.
* We can convert the response body to a Python list using the `json()` method of the response.

In [2]:
import requests

* The response body is a string that contains valid JSON. The `json()` method of the response converts it to a `dict` or `list`; the standard `json` module can do the same for any JSON string.

In [6]:
payload = requests.get('https://api.github.com/repositories', params={'since':369}).json()

In [None]:
payload

In [8]:
since = int(input('Enter the repo id from which you want to get repositories: '))

Enter the repo id from which you want to get repositories:  369


In [10]:
repos = requests.get(f'https://api.github.com/repositories?since={since}').json()
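
The two calls above pass `since` in different ways: once through `params` and once inside an f-string. The sketch below, which uses only the standard library and makes no network call, suggests that both styles should end up with the same URL. The names `BASE_URL`, `url_from_fstring` and `url_from_params` are illustrative, and `urlencode` only approximates what happens to a `params` dict before a request is sent.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.github.com/repositories'
since = 369  # same value as entered at the prompt

# Style 1: format the query string by hand with an f-string.
url_from_fstring = f'{BASE_URL}?since={since}'

# Style 2: let a helper encode the parameters from a dict.
url_from_params = f"{BASE_URL}?{urlencode({'since': since})}"

print(url_from_fstring)
print(url_from_params)
print(url_from_fstring == url_from_params)
```

Passing a dict is usually the safer habit, because values that contain spaces or other special characters are escaped for you.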

In [11]:
type(repos)

list

In [None]:
repos  # The JSON array from the response body, converted to a list of dicts

In [13]:
len(repos)

100

In [14]:
repos[0]

{'id': 370,
 'node_id': 'MDEwOlJlcG9zaXRvcnkzNzA=',
 'name': 'imap_authenticatable',
 'full_name': 'collectiveidea/imap_authenticatable',
 'private': False,
 'owner': {'login': 'collectiveidea',
  'id': 128,
  'node_id': 'MDEyOk9yZ2FuaXphdGlvbjEyOA==',
  'avatar_url': 'https://avatars.githubusercontent.com/u/128?v=4',
  'gravatar_id': '',
  'url': 'https://api.github.com/users/collectiveidea',
  'html_url': 'https://github.com/collectiveidea',
  'followers_url': 'https://api.github.com/users/collectiveidea/followers',
  'following_url': 'https://api.github.com/users/collectiveidea/following{/other_user}',
  'gists_url': 'https://api.github.com/users/collectiveidea/gists{/gist_id}',
  'starred_url': 'https://api.github.com/users/collectiveidea/starred{/owner}{/repo}',
  'subscriptions_url': 'https://api.github.com/users/collectiveidea/subscriptions',
  'organizations_url': 'https://api.github.com/users/collectiveidea/orgs',
  'repos_url': 'https://api.github.com/users/collectiveidea/rep

In [15]:
type(repos[0])

dict

* We can process the data further using appropriate Python modules based upon the requirements.

In [16]:
for repo in repos:
    print(repo['id'])

370
371
372
374
376
377
379
386
388
408
410
413
422
423
425
426
427
429
430
443
469
483
491
492
494
506
507
509
510
511
513
514
520
521
523
531
533
537
538
539
541
543
547
550
556
559
571
586
590
592
597
600
603
608
619
620
622
623
624
625
629
639
641
648
649
653
654
660
662
663
664
682
690
703
713
726
732
733
735
738
751
765
774
780
786
788
800
807
818
828
830
838
839
852
856
866
867
872
876
882


In [17]:
for repo in repos:
    print(repo['name'])

imap_authenticatable
random_finders
with_action
graticule
tinder
invisible
pyprofile
rush
ike
halcyon
cruisecontrolrb
opml-schema
reddy
youtube-g
facebox
haml
kissgen
exception_logger
brain_buster
vanhelsing
linthicum
textilizefu
slate
archangel
god
newjs
twitter
googlebase
googlereader
mirrored
scrobbler
lorem
rails-authorization-plugin
drnic_js_test_helpers
mephisto-erb-templates-plugin
imdb
userstamp
matzbot
pci4r
exposure
packet
elderbrowser
git-wiki
merb-core
jim
blankable
merb-for-rails
portmidi-ruby
vjot
errcount
jquery-autocomplete
finally
alfred
github-campfire
merb-more
merb-core
blanket
merb-more
rubyports
wordcram
dumbapp
merl
squawk-micro
io
fuzed-old
comment_replies
tumblr
codesnippets
barby
sin
jsunittest
delayed_job
admin_tasks
chronic
ruby-satisfaction
llor-nu-legacy
rubyurl
rubyurl
mor7-google-charts-demo
dm
eldorado
stringex
freefall
git-wiki
kroonikko
bus-scheme
bindata
bookqueue
strokedb
bus-scheme
yark
capistrano-bells
calendar-maker
mydry
gemify
mor7
blog
hairbal

In [18]:
# Print each repo name together with its API URL
for repo in repos:
    print(f"{repo['name']}:{repo['url']}")

imap_authenticatable:https://api.github.com/repos/collectiveidea/imap_authenticatable
random_finders:https://api.github.com/repos/collectiveidea/random_finders
with_action:https://api.github.com/repos/collectiveidea/with_action
graticule:https://api.github.com/repos/collectiveidea/graticule
tinder:https://api.github.com/repos/collectiveidea/tinder
invisible:https://api.github.com/repos/macournoyer/invisible
pyprofile:https://api.github.com/repos/tommorris/pyprofile
rush:https://api.github.com/repos/adamwiggins/rush
ike:https://api.github.com/repos/defunkt/ike
halcyon:https://api.github.com/repos/mtodd/halcyon
cruisecontrolrb:https://api.github.com/repos/benburkert/cruisecontrolrb
opml-schema:https://api.github.com/repos/tommorris/opml-schema
reddy:https://api.github.com/repos/tommorris/reddy
youtube-g:https://api.github.com/repos/shane/youtube-g
facebox:https://api.github.com/repos/defunkt/facebox
haml:https://api.github.com/repos/haml/haml
kissgen:https://api.github.com/repos/lancecar

In [19]:
repo_urls = [{'name': repo['name'], 'repo_url': repo['url']} for repo in repos]

In [21]:
repo_urls[0]

{'name': 'imap_authenticatable',
 'repo_url': 'https://api.github.com/repos/collectiveidea/imap_authenticatable'}

In [23]:
repo_urls = list(map(lambda repo: {'name': repo['name'], 'repo_url': repo['url']}, repos))

In [24]:
repo_urls[0]

{'name': 'imap_authenticatable',
 'repo_url': 'https://api.github.com/repos/collectiveidea/imap_authenticatable'}
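
The list comprehension and the `map` call above are two spellings of the same transformation. Here is a small sketch on made-up records; the sample data and the helper name `to_name_and_url` are assumptions for illustration.

```python
# Hypothetical sample records with the same keys used above.
sample_repos = [
    {'name': 'alpha', 'url': 'https://example.com/repos/alpha'},
    {'name': 'beta', 'url': 'https://example.com/repos/beta'},
]


def to_name_and_url(repo):
    """Keep only the name and URL of one repository record."""
    return {'name': repo['name'], 'repo_url': repo['url']}


via_comprehension = [to_name_and_url(repo) for repo in sample_repos]
via_map = list(map(to_name_and_url, sample_repos))

print(via_comprehension == via_map)
```

Naming the helper instead of using a `lambda` makes it reusable in both forms; which form to prefer is largely a matter of taste.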

Here are some of the tasks you can work on using `repos` data. We will explore the solutions using functions such as `map`, `filter`, `itertools.groupby`, etc.

In [25]:
repos = requests.get(f'https://api.github.com/repositories?since={since}').json()

* Get number of repositories.

In [26]:
len(repos)

100

* Get repository name, url and owner type of all repositories. Each element in the new list should be of type **tuple**.

In [27]:
repo = repos[0]

In [28]:
repo

{'id': 370,
 'node_id': 'MDEwOlJlcG9zaXRvcnkzNzA=',
 'name': 'imap_authenticatable',
 'full_name': 'collectiveidea/imap_authenticatable',
 'private': False,
 'owner': {'login': 'collectiveidea',
  'id': 128,
  'node_id': 'MDEyOk9yZ2FuaXphdGlvbjEyOA==',
  'avatar_url': 'https://avatars.githubusercontent.com/u/128?v=4',
  'gravatar_id': '',
  'url': 'https://api.github.com/users/collectiveidea',
  'html_url': 'https://github.com/collectiveidea',
  'followers_url': 'https://api.github.com/users/collectiveidea/followers',
  'following_url': 'https://api.github.com/users/collectiveidea/following{/other_user}',
  'gists_url': 'https://api.github.com/users/collectiveidea/gists{/gist_id}',
  'starred_url': 'https://api.github.com/users/collectiveidea/starred{/owner}{/repo}',
  'subscriptions_url': 'https://api.github.com/users/collectiveidea/subscriptions',
  'organizations_url': 'https://api.github.com/users/collectiveidea/orgs',
  'repos_url': 'https://api.github.com/users/collectiveidea/rep

In [29]:
repo['name']

'imap_authenticatable'

In [30]:
repo['url']

'https://api.github.com/repos/collectiveidea/imap_authenticatable'

In [31]:
repo['owner']['type']

'Organization'

In [33]:
list(map(lambda repo: (repo['name'], repo['url'], repo['owner']['type']), repos))

[('imap_authenticatable',
  'https://api.github.com/repos/collectiveidea/imap_authenticatable',
  'Organization'),
 ('random_finders',
  'https://api.github.com/repos/collectiveidea/random_finders',
  'Organization'),
 ('with_action',
  'https://api.github.com/repos/collectiveidea/with_action',
  'Organization'),
 ('graticule',
  'https://api.github.com/repos/collectiveidea/graticule',
  'Organization'),
 ('tinder',
  'https://api.github.com/repos/collectiveidea/tinder',
  'Organization'),
 ('invisible', 'https://api.github.com/repos/macournoyer/invisible', 'User'),
 ('pyprofile', 'https://api.github.com/repos/tommorris/pyprofile', 'User'),
 ('rush', 'https://api.github.com/repos/adamwiggins/rush', 'User'),
 ('ike', 'https://api.github.com/repos/defunkt/ike', 'User'),
 ('halcyon', 'https://api.github.com/repos/mtodd/halcyon', 'User'),
 ('cruisecontrolrb',
  'https://api.github.com/repos/benburkert/cruisecontrolrb',
  'User'),
 ('opml-schema', 'https://api.github.com/repos/tommorris/opm
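
Once each repository is reduced to a tuple, tuple unpacking gives the fields readable names. This is a minimal sketch on invented records, not the real payload; `sample_repos` and `repo_details` are illustrative names.

```python
# Hypothetical sample records shaped like the payload above.
sample_repos = [
    {'name': 'alpha', 'url': 'https://example.com/repos/alpha',
     'owner': {'type': 'User'}},
    {'name': 'beta', 'url': 'https://example.com/repos/beta',
     'owner': {'type': 'Organization'}},
]

# Same projection as the map/lambda above, written as a comprehension.
repo_details = [
    (repo['name'], repo['url'], repo['owner']['type'])
    for repo in sample_repos
]

# Tuple unpacking names each field inside the loop.
for name, url, owner_type in repo_details:
    print(f'{name} ({owner_type}): {url}')
```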

* Get all unique or distinct owner types of the repositories. The output should be of type **list**.

In [34]:
list(map(lambda repo: repo['owner']['type'], repos))

['Organization',
 'Organization',
 'Organization',
 'Organization',
 'Organization',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'Organization',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'Organization',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User']

In [35]:
set(map(lambda repo: repo['owner']['type'], repos))

{'Organization', 'User'}

In [36]:
list(set(map(lambda repo: repo['owner']['type'], repos)))

['Organization', 'User']
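
A `set` does not promise any particular order, so `list(set(...))` may list the owner types differently from run to run. If a stable order matters, here are two possible approaches, sketched on a made-up list of owner types.

```python
# Hypothetical owner types, in the order they might appear in a payload.
sample_types = ['User', 'Organization', 'User', 'Organization', 'User']

# Option 1: sort the distinct values for an alphabetical, repeatable order.
unique_sorted = sorted(set(sample_types))

# Option 2: dict.fromkeys keeps the first occurrence of each value,
# so the result follows the order of first appearance.
unique_first_seen = list(dict.fromkeys(sample_types))

print(unique_sorted)
print(unique_first_seen)
```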

* Get number of repositories where owner type is **User**.

In [37]:
repo['owner']['type'] == 'User'

False

In [None]:
list(filter(lambda repo: repo['owner']['type'] == 'User', repos))

In [39]:
len(list(filter(lambda repo: repo['owner']['type'] == 'User', repos)))

93

* Get number of repositories where owner type is **Organization**.

In [40]:
len(list(filter(lambda repo: repo['owner']['type'] == 'Organization', repos)))

7
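
`len(list(filter(...)))` builds a full list only to measure it. As an alternative sketch, a generator expression inside `sum` counts matches without the intermediate list. The sample records and the helper name `count_by_owner_type` are assumptions for illustration.

```python
# Hypothetical sample records; only the fields needed for counting.
sample_repos = [
    {'id': 1, 'owner': {'type': 'User'}},
    {'id': 2, 'owner': {'type': 'Organization'}},
    {'id': 3, 'owner': {'type': 'User'}},
]


def count_by_owner_type(repos, owner_type):
    """Count records whose owner type matches, without building a list."""
    return sum(1 for repo in repos if repo['owner']['type'] == owner_type)


user_count = count_by_owner_type(sample_repos, 'User')
org_count = count_by_owner_type(sample_repos, 'Organization')
print(user_count, org_count)
```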

* Get number of repositories by each owner type.

In [41]:
import itertools as iter  # Note: this alias hides the built-in iter() for the rest of the session

In [42]:
list(map(lambda repo: repo['owner']['type'], repos))

['Organization',
 'Organization',
 'Organization',
 'Organization',
 'Organization',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'Organization',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'Organization',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User']

In [43]:
sorted(map(lambda repo: repo['owner']['type'], repos))

['Organization',
 'Organization',
 'Organization',
 'Organization',
 'Organization',
 'Organization',
 'Organization',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User']

In [44]:
repo_types = sorted(map(lambda repo: repo['owner']['type'], repos))

In [45]:
repo_types

['Organization',
 'Organization',
 'Organization',
 'Organization',
 'Organization',
 'Organization',
 'Organization',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User',
 'User']

In [46]:
iter.groupby(repo_types)

<itertools.groupby at 0x7f758d5eff48>

In [47]:
for owner_type, group in iter.groupby(repo_types):
    print((owner_type, list(group)))

('Organization', ['Organization', 'Organization', 'Organization', 'Organization', 'Organization', 'Organization', 'Organization'])
('User', ['User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User'])


In [48]:
list(map(lambda item: (item[0], len(list(item[1]))), iter.groupby(repo_types)))

[('Organization', 7), ('User', 93)]
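
The sorting step matters because `groupby` only merges items that sit next to each other. The sketch below, on a made-up list of owner types, contrasts unsorted and sorted input and shows `collections.Counter` as one alternative that needs no sorting.

```python
from collections import Counter
from itertools import groupby

# Hypothetical owner types in arrival order (not sorted).
sample_types = ['User', 'Organization', 'User', 'User', 'Organization']

# Unsorted input: each run of adjacent equal values becomes its own group.
unsorted_counts = [
    (key, len(list(group))) for key, group in groupby(sample_types)
]

# Sorted input: equal values are adjacent, so there is one group per key.
sorted_counts = [
    (key, len(list(group))) for key, group in groupby(sorted(sample_types))
]

# Counter tallies the values directly, regardless of order.
counter_counts = Counter(sample_types)

print(unsorted_counts)
print(sorted_counts)
print(counter_counts)
```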

* Sort the data by owner type and then by id. Ensure that `id` is compared as a number, not as a string.

In [49]:
repo

{'id': 370,
 'node_id': 'MDEwOlJlcG9zaXRvcnkzNzA=',
 'name': 'imap_authenticatable',
 'full_name': 'collectiveidea/imap_authenticatable',
 'private': False,
 'owner': {'login': 'collectiveidea',
  'id': 128,
  'node_id': 'MDEyOk9yZ2FuaXphdGlvbjEyOA==',
  'avatar_url': 'https://avatars.githubusercontent.com/u/128?v=4',
  'gravatar_id': '',
  'url': 'https://api.github.com/users/collectiveidea',
  'html_url': 'https://github.com/collectiveidea',
  'followers_url': 'https://api.github.com/users/collectiveidea/followers',
  'following_url': 'https://api.github.com/users/collectiveidea/following{/other_user}',
  'gists_url': 'https://api.github.com/users/collectiveidea/gists{/gist_id}',
  'starred_url': 'https://api.github.com/users/collectiveidea/starred{/owner}{/repo}',
  'subscriptions_url': 'https://api.github.com/users/collectiveidea/subscriptions',
  'organizations_url': 'https://api.github.com/users/collectiveidea/orgs',
  'repos_url': 'https://api.github.com/users/collectiveidea/rep

In [50]:
type(repo['id'])

int

In [51]:
sorted(repos, key=lambda repo: (repo['owner']['type'], repo['id']))

[{'id': 370,
  'node_id': 'MDEwOlJlcG9zaXRvcnkzNzA=',
  'name': 'imap_authenticatable',
  'full_name': 'collectiveidea/imap_authenticatable',
  'private': False,
  'owner': {'login': 'collectiveidea',
   'id': 128,
   'node_id': 'MDEyOk9yZ2FuaXphdGlvbjEyOA==',
   'avatar_url': 'https://avatars.githubusercontent.com/u/128?v=4',
   'gravatar_id': '',
   'url': 'https://api.github.com/users/collectiveidea',
   'html_url': 'https://github.com/collectiveidea',
   'followers_url': 'https://api.github.com/users/collectiveidea/followers',
   'following_url': 'https://api.github.com/users/collectiveidea/following{/other_user}',
   'gists_url': 'https://api.github.com/users/collectiveidea/gists{/gist_id}',
   'starred_url': 'https://api.github.com/users/collectiveidea/starred{/owner}{/repo}',
   'subscriptions_url': 'https://api.github.com/users/collectiveidea/subscriptions',
   'organizations_url': 'https://api.github.com/users/collectiveidea/orgs',
   'repos_url': 'https://api.github.com/users
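
To see why the numeric comparison matters, here is a small sketch on invented records. The ids are chosen so that integer and string ordering disagree; `sample_repos` and `sort_key` are illustrative names.

```python
# Hypothetical sample records with ids of different lengths.
sample_repos = [
    {'id': 10, 'owner': {'type': 'User'}},
    {'id': 9, 'owner': {'type': 'User'}},
    {'id': 100, 'owner': {'type': 'Organization'}},
    {'id': 20, 'owner': {'type': 'Organization'}},
]


def sort_key(repo):
    """Sort by owner type first, then by id compared as an integer."""
    return (repo['owner']['type'], int(repo['id']))


numeric_order = [repo['id'] for repo in sorted(sample_repos, key=sort_key)]

# For contrast: ids compared as strings are ordered character by character.
string_order = [
    repo['id']
    for repo in sorted(
        sample_repos, key=lambda r: (r['owner']['type'], str(r['id']))
    )
]

print(numeric_order)
print(string_order)
```

Since `type(repo['id'])` is already `int` in this payload, the `int(...)` call is only a safeguard for sources that deliver ids as strings.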