## Process REST Payload using Collection Operations

Let us understand how to process REST Payload using Collection Operations.
* We can get details about all the public repositories using `GET /repositories` from **https://api.github.com**.
* Because we are reading data from an external application, the details are available via `GET`. The payload contains a JSON array.
* We can convert this JSON array to a Python `list`. Each element in the list will be of type `dict`.
* Let us understand how the data in this list of dicts can be processed using Python core collection operations.
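
The conversion from a JSON array to a list of dicts can be sketched without any network call. The `payload_text` string below is a hypothetical, trimmed-down stand-in for the real response, which carries many more fields per repository.

```python
import json

# Hypothetical sample: a JSON array of two objects, held as a plain string.
payload_text = '[{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]'

# json.loads turns the JSON array into a list and each JSON object into a dict.
records = json.loads(payload_text)

print(type(records))        # <class 'list'>
print(type(records[0]))     # <class 'dict'>
print(records[0]['name'])   # alpha
```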

* Let us quickly review the output of the REST API using `curl`.

In [None]:
!curl https://api.github.com/repositories

* We can get the payload of public repositories using `requests.get`.
* Calling `json()` on the response converts the payload to a Python list.

In [21]:
import requests

* The response body is a string containing valid JSON. The `json()` method of the response parses it into a `dict` or `list`, much as `json.loads` from the `json` module would.

In [4]:
payload = requests.get('https://api.github.com/repositories', params={'since':369}).json()

In [None]:
payload

In [6]:
since = int(input('Enter the repo id from which you want to get repositories: '))

Enter the repo id from which you want to get repositories:  369


In [7]:
repos = requests.get(f'https://api.github.com/repositories?since={since}').json()
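
The two requests above pass `since` in different ways: once through `params` and once by formatting it into the URL. As a rough illustration, assuming a simple integer parameter is encoded the same way by `requests`, the standard library's `urlencode` shows that both styles lead to the same URL. No request is sent here.

```python
from urllib.parse import urlencode

base_url = 'https://api.github.com/repositories'
since = 369

# Build the query string from a dict, the way a params= argument is given.
query = urlencode({'since': since})
url_from_params = f'{base_url}?{query}'

# Build the same URL by hand with an f-string.
url_from_fstring = f'{base_url}?since={since}'

print(url_from_params)
print(url_from_params == url_from_fstring)  # True
```

Passing a dict is usually the safer habit, since values containing spaces or special characters are escaped for us.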

In [8]:
type(repos)

list

In [None]:
repos # The JSON array from the response body, converted to a list of dicts

In [10]:
len(repos)

100

In [None]:
repos[0]

In [12]:
type(repos[0])

dict

* We can process the data further using appropriate Python modules based upon the requirements.

In [None]:
for repo in repos:
    print(repo['id'])

In [None]:
for repo in repos:
    print(repo['name'])

In [None]:
# Getting repo name and urls
for repo in repos:
    print(f"{repo['name']}:{repo['url']}")

In [15]:
repo_urls = [{'name': repo['name'], 'repo_url': repo['url']} for repo in repos]

In [16]:
repo_urls[0]

{'name': 'imap_authenticatable',
 'repo_url': 'https://api.github.com/repos/collectiveidea/imap_authenticatable'}

In [18]:
repo_urls = list(map(lambda repo: {'name': repo['name'], 'repo_url': repo['url']}, repos))

In [19]:
repo_urls[0]

{'name': 'imap_authenticatable',
 'repo_url': 'https://api.github.com/repos/collectiveidea/imap_authenticatable'}
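
The list comprehension and the `map` call above produce the same result. The sketch below checks that on a small hypothetical `sample_repos` list, with the lambda replaced by a named helper (`to_name_and_url`, also an illustrative name) so both forms can share it.

```python
# Hypothetical records that mimic only the two fields used here.
sample_repos = [
    {'name': 'alpha', 'url': 'https://example.com/repos/alpha'},
    {'name': 'beta', 'url': 'https://example.com/repos/beta'},
]

def to_name_and_url(repo):
    """Keep only the name and URL of one repository record."""
    return {'name': repo['name'], 'repo_url': repo['url']}

via_comprehension = [to_name_and_url(repo) for repo in sample_repos]
via_map = list(map(to_name_and_url, sample_repos))

print(via_comprehension == via_map)  # True
```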

Here are some of the tasks you can work on using `repos` data. We will explore the solutions using functions such as `map`, `filter`, `itertools.groupby`, etc.

In [20]:
repos = requests.get(f'https://api.github.com/repositories?since={since}').json()

* Get number of repositories.

In [21]:
len(repos)

100

* Get the repository name, URL, and owner type of all repositories. Each element in the new list should be of type **tuple**.

In [13]:
repo = repos[0]

In [None]:
repo

In [15]:
repo['name']

'imap_authenticatable'

In [16]:
repo['url']

'https://api.github.com/repos/collectiveidea/imap_authenticatable'

In [17]:
repo['owner']['type']

'Organization'

In [None]:
list(map(lambda repo: (repo['name'], repo['url'], repo['owner']['type']), repos))
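
The same tuples can also be built with a list comprehension, which some readers find easier to scan than `map` with a lambda. This is a minimal sketch on hypothetical `sample_repos` records that carry only the fields needed here.

```python
# Hypothetical records; real ones have many more keys under 'owner'.
sample_repos = [
    {'name': 'alpha', 'url': 'https://example.com/repos/alpha',
     'owner': {'type': 'User'}},
    {'name': 'beta', 'url': 'https://example.com/repos/beta',
     'owner': {'type': 'Organization'}},
]

repo_details = [
    (repo['name'], repo['url'], repo['owner']['type'])
    for repo in sample_repos
]

# Tuples unpack directly in the loop header.
for name, url, owner_type in repo_details:
    print(f'{name} ({owner_type}): {url}')
```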

* Get all unique or distinct owner types of the repositories. The output should be of type **list**.

In [None]:
list(map(lambda repo: repo['owner']['type'], repos))

In [29]:
set(map(lambda repo: repo['owner']['type'], repos))

{'Organization', 'User'}

In [30]:
list(set(map(lambda repo: repo['owner']['type'], repos)))

['User', 'Organization']
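
A `set` has no guaranteed order, which is why the two outputs above list the owner types differently. If a predictable order matters, one option is to sort the distinct values. The `owner_types` list below is hypothetical sample data.

```python
# Hypothetical owner types, as extracted from a list of repositories.
owner_types = ['User', 'Organization', 'User', 'User', 'Organization']

# set() removes duplicates; sorted() returns a list in a stable order.
distinct_types = sorted(set(owner_types))

print(distinct_types)  # ['Organization', 'User']
```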

* Get number of repositories where owner type is **User**.

In [31]:
repo['owner']['type'] == 'User'

False

In [None]:
list(filter(lambda repo: repo['owner']['type'] == 'User', repos))

In [33]:
len(list(filter(lambda repo: repo['owner']['type'] == 'User', repos)))

93

* Get number of repositories where owner type is **Organization**.

In [34]:
len(list(filter(lambda repo: repo['owner']['type'] == 'Organization', repos)))

7
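
When only the count is needed, `sum` over a generator expression avoids building the intermediate filtered list. This is an alternative sketch, not a replacement for the `filter` approach above. `count_by_owner_type` and `sample_repos` are hypothetical names.

```python
# Hypothetical records with only the fields used for counting.
sample_repos = [
    {'id': 1, 'owner': {'type': 'User'}},
    {'id': 2, 'owner': {'type': 'Organization'}},
    {'id': 3, 'owner': {'type': 'User'}},
]

def count_by_owner_type(repos, owner_type):
    """Count the repositories whose owner has the given type."""
    return sum(1 for repo in repos if repo['owner']['type'] == owner_type)

print(count_by_owner_type(sample_repos, 'User'))          # 2
print(count_by_owner_type(sample_repos, 'Organization'))  # 1
```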

* Get number of repositories by each owner type.

In [22]:
import itertools as iter  # Note: this alias shadows the built-in iter() function

In [None]:
list(map(lambda repo: repo['owner']['type'], repos))

In [None]:
sorted(map(lambda repo: repo['owner']['type'], repos))

In [25]:
repo_types = sorted(map(lambda repo: repo['owner']['type'], repos))

In [None]:
repo_types

In [91]:
iter.groupby(repo_types)

<itertools.groupby at 0x7f59e5743548>

In [92]:
for item in iter.groupby(repo_types):
    print((item[0], list(item[1])))

('Organization', ['Organization', 'Organization', 'Organization', 'Organization', 'Organization', 'Organization', 'Organization'])
('User', ['User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User', 'User'])

In [42]:
list(map(lambda item: (item[0], len(list(item[1]))), iter.groupby(repo_types)))

[('Organization', 7), ('User', 93)]
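
`itertools.groupby` only groups adjacent equal items, which is why the owner types are sorted first. The sketch below shows what happens without sorting and compares the result with `collections.Counter`, which needs no sorting. The `owner_types` list is hypothetical sample data.

```python
from collections import Counter
from itertools import groupby

# Hypothetical owner types in their original, unsorted order.
owner_types = ['User', 'Organization', 'User', 'User']

# Without sorting, the same key shows up in more than one group.
unsorted_counts = [(key, len(list(group))) for key, group in groupby(owner_types)]
print(unsorted_counts)  # [('User', 1), ('Organization', 1), ('User', 2)]

# Sorting first puts equal keys next to each other.
grouped_counts = [(key, len(list(group)))
                  for key, group in groupby(sorted(owner_types))]
print(grouped_counts)   # [('Organization', 1), ('User', 3)]

# Counter tallies the values directly, regardless of order.
counter_counts = Counter(owner_types)
print(dict(counter_counts) == dict(grouped_counts))  # True
```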

* Sort the data by owner type and then by id. Ensure that the ids are compared as numbers, not as strings.

In [43]:
repo

{'id': 370,
 'node_id': 'MDEwOlJlcG9zaXRvcnkzNzA=',
 'name': 'imap_authenticatable',
 'full_name': 'collectiveidea/imap_authenticatable',
 'private': False,
 'owner': {'login': 'collectiveidea',
  'id': 128,
  'node_id': 'MDEyOk9yZ2FuaXphdGlvbjEyOA==',
  'avatar_url': 'https://avatars.githubusercontent.com/u/128?v=4',
  'gravatar_id': '',
  'url': 'https://api.github.com/users/collectiveidea',
  'html_url': 'https://github.com/collectiveidea',
  'followers_url': 'https://api.github.com/users/collectiveidea/followers',
  'following_url': 'https://api.github.com/users/collectiveidea/following{/other_user}',
  'gists_url': 'https://api.github.com/users/collectiveidea/gists{/gist_id}',
  'starred_url': 'https://api.github.com/users/collectiveidea/starred{/owner}{/repo}',
  'subscriptions_url': 'https://api.github.com/users/collectiveidea/subscriptions',
  'organizations_url': 'https://api.github.com/users/collectiveidea/orgs',
  'repos_url': 'https://api.github.com/users/collectiveidea/rep

In [44]:
type(repo['id'])

int

In [None]:
sorted(repos, key=lambda repo: (repo['owner']['type'], repo['id']))
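
Sorting with a tuple key compares the first element and falls back to the second only on ties. The sketch below shows this on hypothetical `sample_repos` records, and also why numeric ids matter: if ids were strings, they would sort character by character.

```python
# Hypothetical records with only the fields used for sorting.
sample_repos = [
    {'id': 10, 'owner': {'type': 'User'}},
    {'id': 9, 'owner': {'type': 'User'}},
    {'id': 100, 'owner': {'type': 'Organization'}},
    {'id': 2, 'owner': {'type': 'Organization'}},
]

sorted_repos = sorted(
    sample_repos,
    key=lambda repo: (repo['owner']['type'], repo['id'])
)
sorted_pairs = [(repo['owner']['type'], repo['id']) for repo in sorted_repos]
print(sorted_pairs)
# [('Organization', 2), ('Organization', 100), ('User', 9), ('User', 10)]

# String ids sort lexicographically; key=int restores numeric order.
string_ids = ['100', '2', '10', '9']
print(sorted(string_ids))           # ['10', '100', '2', '9']
print(sorted(string_ids, key=int))  # ['2', '9', '10', '100']
```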