## Create DynamoDB Tables

Let us create two tables: one to store the GitHub repository data and one to store a marker (bookmark). The marker records where the previous API call stopped, so that later calls can fetch the data incrementally.
* Create a table called `ghrepos` to hold the following fields from the API output.
  * id
  * node_id
  * name
  * full_name
  * owner.login
  * owner.id
  * owner.node_id
  * owner.type
  * owner.site_admin
  * html_url
  * description
  * fork
  * created_at
* Create a table called `ghmarker`. It will contain only one record with 3 columns.
  * tn (table name, here `ghrepos`)
  * marker (the last id returned by each "list all repos" call). We store it as a string so that the same table can hold markers for other API calls that populate other tables.
  * status (success or failed)
* **Note:** Because DynamoDB is a NoSQL database, only the key attributes are declared when a table is created. All other column (attribute) names are supplied along with the data when items are loaded into the table.
* We can reuse `ghmarker` for similar scenarios (invoking other APIs incrementally and populating other tables).
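
To make the field list above concrete, here is a minimal sketch of how one repository record from the API output might be trimmed down to a `ghrepos` item. The helper name `to_ghrepos_item` is hypothetical, and keeping the `owner.*` fields as a nested map (rather than flattening them) is an assumption, not something this notebook prescribes.

```python
# Top-level fields and owner sub-fields taken from the list above.
REPO_FIELDS = ('id', 'node_id', 'name', 'full_name',
               'html_url', 'description', 'fork', 'created_at')
OWNER_FIELDS = ('login', 'id', 'node_id', 'type', 'site_admin')


def to_ghrepos_item(repo):
    """Return a dict holding only the listed fields of one repo record.

    `repo` is assumed to be a dict parsed from the API's JSON output.
    """
    item = {field: repo.get(field) for field in REPO_FIELDS}
    owner = repo.get('owner') or {}
    item['owner'] = {field: owner.get(field) for field in OWNER_FIELDS}
    # Drop top-level attributes that are missing (for example a null description).
    return {key: value for key, value in item.items() if value is not None}
```

Because only the key attribute `id` is declared on the table, an item built this way could later be passed to `put_item` without any further schema changes.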

### 1. Test new IAM policy with DynamoDB

In [34]:
%%sh

aws dynamodb list-tables --profile itvgithub 

{
    "TableNames": []
}


### 2. Use Boto3 with DynamoDB

In [35]:
import boto3

In [36]:
import os

In [37]:
os.environ.setdefault('AWS_PROFILE', 'itvgithub')

'itvgithub'

In [15]:
# os.environ.setdefault('AWS_DEFAULT_REGION', 'ap-southeast-1')

In [38]:
dynamodb = boto3.resource('dynamodb', region_name="ap-southeast-1")
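
The `aws dynamodb list-tables` check from the previous step can also be done from Python. The sketch below assumes a low-level DynamoDB client is passed in (for example `dynamodb.meta.client`); the helper name `list_all_tables` is hypothetical. It follows `LastEvaluatedTableName` so that accounts with many tables are listed completely.

```python
def list_all_tables(client):
    """Return every table name visible to `client`, following pagination."""
    names = []
    kwargs = {}
    while True:
        page = client.list_tables(**kwargs)
        names.extend(page.get('TableNames', []))
        last = page.get('LastEvaluatedTableName')
        if not last:
            # No more pages to fetch.
            return names
        # Ask for the next page, starting after the last name seen.
        kwargs['ExclusiveStartTableName'] = last
```

With the resource created above, a call such as `list_all_tables(dynamodb.meta.client)` should return an empty list until the tables below are created.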

In [39]:
dynamodb.create_table?

Signature: dynamodb.create_table(*args, **kwargs)
Docstring:
The ``CreateTable`` operation adds a new table to your account. In an Amazon Web Services account, table names must be unique within each Region. That is, you can have two tables with same name if you create the tables in different Regions.

 

 ``CreateTable`` is an asynchronous operation. Upon receiving a ``CreateTable`` request, DynamoDB immediately returns a response with a ``TableStatus`` of ``CREATING``. After the table is created, DynamoDB sets the ``TableStatus`` to ``ACTIVE``. You can perform read and write operations only on an ``ACTIVE`` table.

 

You can optionally define secondary indexes on the new table, as part of the ``CreateTable`` operation. If you want to create multiple tables with secondary indexes on them, you must create the tables sequentially. Only one tab

### 3. Create Marker Table: ghmarker

In [40]:
ghmarker = dynamodb.create_table(
      TableName='ghmarker',
      KeySchema=[
          {
              'AttributeName': 'tn',
              'KeyType': 'HASH'  # HASH = partition key; a table may also add a RANGE (sort) key
          },
      ],
      AttributeDefinitions=[
          {
              'AttributeName': 'tn',
              'AttributeType': 'S'  # S = String, N = Number, B = Binary
          },
      ],
      BillingMode='PROVISIONED',
      ProvisionedThroughput=
        {
          'ReadCapacityUnits': 5,
          'WriteCapacityUnits': 5
        },
  )

In [41]:
ghmarker.name

'ghmarker'

In [42]:
ghmarker.table_status

'CREATING'
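
Since `CreateTable` is asynchronous, the status is `CREATING` right after the call. Instead of re-checking by hand, one option is to block until the table is ready. The sketch below uses the `wait_until_exists()` and `reload()` methods of a boto3 `Table` resource; the helper name `wait_for_active` is hypothetical.

```python
def wait_for_active(table):
    """Block until `table` is usable, then return its refreshed status."""
    # Polls DescribeTable until the table reports ACTIVE.
    table.wait_until_exists()
    # Refresh cached attributes so table_status reflects the current state.
    table.reload()
    return table.table_status
```

For example, `wait_for_active(ghmarker)` would be expected to return `'ACTIVE'` once creation finishes.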

In [43]:
ghm_table = dynamodb.Table('ghmarker')

In [44]:
ghm_table.table_status

'ACTIVE'
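
With the table active, the single marker record described at the top of this section could be built as follows. This is only a sketch: the helper name `build_marker_item` is hypothetical, and the id used in the usage note is a made-up value.

```python
VALID_STATUSES = ('success', 'failed')


def build_marker_item(table_name, last_id, status):
    """Build the one-row marker record keyed by the target table name."""
    if status not in VALID_STATUSES:
        raise ValueError(f'status must be one of {VALID_STATUSES}')
    return {
        'tn': table_name,        # partition key of ghmarker
        'marker': str(last_id),  # stored as a string so other APIs can reuse the table
        'status': status,
    }
```

It could then be written with `ghm_table.put_item(Item=build_marker_item('ghrepos', 369, 'success'))` and read back with `ghm_table.get_item(Key={'tn': 'ghrepos'})`.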

### 4. Create Data Table: ghrepos

In [45]:
ghrepos = dynamodb.create_table(
      TableName='ghrepos',
      KeySchema=[
          {
              'AttributeName': 'id',
              'KeyType': 'HASH'
          },
      ],
      AttributeDefinitions=[
          {
              'AttributeName': 'id',
              'AttributeType': 'N'
          },
      ],
      BillingMode='PAY_PER_REQUEST'
  )

In [46]:
ghrepos.name

'ghrepos'

In [47]:
ghrepos.table_status

'CREATING'

In [48]:
ghr_table = dynamodb.Table('ghrepos')

In [49]:
ghr_table.table_status

'ACTIVE'
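
The two `create_table` calls above differ mainly in billing mode: `ghmarker` uses `PROVISIONED` with explicit read and write capacity, while `ghrepos` uses `PAY_PER_REQUEST` and passes no `ProvisionedThroughput`. As a rough sketch, a hypothetical helper `table_spec` could build the arguments for either case for a table with a single partition key.

```python
def table_spec(name, key_name, key_type, rcu=None, wcu=None):
    """Build create_table keyword arguments for a table with one HASH key.

    Passing both rcu and wcu selects PROVISIONED billing;
    passing neither selects PAY_PER_REQUEST.
    """
    spec = {
        'TableName': name,
        'KeySchema': [{'AttributeName': key_name, 'KeyType': 'HASH'}],
        'AttributeDefinitions': [
            {'AttributeName': key_name, 'AttributeType': key_type},
        ],
    }
    if rcu is None and wcu is None:
        spec['BillingMode'] = 'PAY_PER_REQUEST'
    elif rcu is None or wcu is None:
        raise ValueError('provide both rcu and wcu, or neither')
    else:
        spec['BillingMode'] = 'PROVISIONED'
        spec['ProvisionedThroughput'] = {
            'ReadCapacityUnits': rcu,
            'WriteCapacityUnits': wcu,
        }
    return spec
```

Under these assumptions, `dynamodb.create_table(**table_spec('ghmarker', 'tn', 'S', 5, 5))` and `dynamodb.create_table(**table_spec('ghrepos', 'id', 'N'))` would correspond to the two calls shown in this section.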