### Article Details
The website lets the user select a date on a calendar and then displays the list of articles that were used in the prediction for that date. The endpoint that powers this accepts the date and returns the title and source URL of each article. There are several thousand of these articles, and they need to be placed in a repo.

AWS DynamoDB is used to store the article metadata.
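
As a rough sketch of what the endpoint's lookup could look like once the table and the `PublishDateIndex` built in the steps below are in place: the function name `articles_for_date` is hypothetical, and `publish_date` is assumed to be an integer in whatever encoding the source JSON uses for `publishdate`. The key condition is passed as a plain expression string so the sketch has no imports; `Key("publishdate").eq(publish_date)` from `boto3.dynamodb.conditions` should be an equivalent way to write it.

```python
def articles_for_date(table, publish_date, index_name="PublishDateIndex"):
    """Return the title and source URL of every article with the given publish date."""
    query_args = {
        "IndexName": index_name,
        # Assumption: publishdate is stored as a number, as in the load step.
        "KeyConditionExpression": "publishdate = :d",
        "ExpressionAttributeValues": {":d": publish_date},
    }
    results = []
    while True:
        response = table.query(**query_args)
        for item in response.get("Items", []):
            results.append({"title": item["title"], "sourceurl": item["sourceurl"]})
        # A response carries LastEvaluatedKey when more pages remain.
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            return results
        query_args["ExclusiveStartKey"] = last_key
```

Here `table` would be the DynamoDB table resource for `articlesRepo`. Because the index projects all attributes, `title` and `sourceurl` can be read directly from the index without a second lookup.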

#### 1 Import Dependencies
The keys for AWS are stored within secrets.py.

In [1]:
import json
import datetime
import boto3
from boto3.dynamodb.conditions import Key, Attr

from secrets import AwsAccesKeyID
from secrets import AwsScretKey

#### 2 Constants
Define the variables that are used throughout the notebook.

In [2]:
awsRegionName = "us-west-2"
tableName = "articlesRepo"
publishDateIndexName = "PublishDateIndex"

#### 3 Reference DynamoDB
Create a reference to DynamoDB using the credentials from secrets.py.

In [3]:
dynamodb = boto3.resource('dynamodb',
                        region_name=awsRegionName,
                        aws_access_key_id = AwsAccesKeyID,
                        aws_secret_access_key = AwsScretKey)

print("--> Completed reference to dynamodb")

--> Completed reference to dynamodb


#### 4 Create Table
Create the table using ID as the primary key.

In [4]:
repoTable = dynamodb.create_table(
    TableName = tableName,
    # Declare Primary Key with KeySchema
    KeySchema =[
        {
            "AttributeName" : "ID",
            "KeyType": "HASH"
        }
    ],
    # Declare AttributeDefinition
    AttributeDefinitions=[
        {
            "AttributeName" : "ID",
            "AttributeType" : "S"
        }
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 1,
        'WriteCapacityUnits': 1
    }
    )

print("--> Completed creating table")

--> Completed creating table


#### 5 Create Secondary Index
The index is used for querying by date. The table must finish being created before the secondary index can be added, so wait about 10 seconds after the previous cell before running this one.
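
A fixed pause can be too short or too long. As a minimal sketch, the boto3 table resource offers a waiter that polls until the table is reported as active; the helper name `wait_for_table` is hypothetical.

```python
def wait_for_table(table):
    """Block until DynamoDB reports that the table exists, then return its status."""
    # wait_until_exists() polls DescribeTable until the table is available
    # or the waiter gives up and raises an error.
    table.wait_until_exists()
    # Refresh the cached attributes so table_status reflects the current state.
    table.reload()
    return table.table_status
```

Calling `wait_for_table(repoTable)` after the create-table cell would replace the manual wait.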

In [6]:
dynamodbClient = boto3.client('dynamodb',
                        region_name=awsRegionName,
                        aws_access_key_id = AwsAccesKeyID,
                        aws_secret_access_key = AwsScretKey)

dynamodbClient.update_table(
    TableName = tableName,
    AttributeDefinitions =[
        {
            "AttributeName" : "publishdate",
            "AttributeType" : "N"
        }
    ],
    GlobalSecondaryIndexUpdates=[
        {
            "Create" : {
                "IndexName": publishDateIndexName,
                "KeySchema" : [
                    {
                        "AttributeName" : "publishdate",
                        "KeyType" : "HASH"
                    }
                ],
                "Projection" : {
                    "ProjectionType" : "ALL"
                },
                "ProvisionedThroughput": {
                        "ReadCapacityUnits": 1,
                        "WriteCapacityUnits": 1,
                }
            }
        }
    ]
)

print("--> Completed secondary index")

--> Completed secondary index
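
`update_table` returns as soon as the request is accepted, while the index itself is built in the background. The sketch below polls `describe_table` until the index reports `ACTIVE`. The helper name `wait_for_index` and the default delay and attempt limit are assumptions chosen for illustration.

```python
import time


def wait_for_index(client, table_name, index_name, delay_seconds=5, max_attempts=60):
    """Poll DescribeTable until the named global secondary index is ACTIVE.

    Returns True once the index is active, False if the attempts run out.
    """
    for _ in range(max_attempts):
        description = client.describe_table(TableName=table_name)["Table"]
        for index in description.get("GlobalSecondaryIndexes", []):
            if index["IndexName"] == index_name and index.get("IndexStatus") == "ACTIVE":
                return True
        time.sleep(delay_seconds)
    return False
```

For this notebook the call would be `wait_for_index(dynamodbClient, tableName, publishDateIndexName)`.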


#### 6 Load Article Data
The JSON file contains the article information.

In [7]:
sourceFile = "articles_all.json"
    
with open(sourceFile) as jsonFile:
    sourceData = json.load(jsonFile)
    
print(f"--> Completed loading file; number of articles: {len(sourceData['articles'])}")

--> Completed loading file; number of articles: 6268
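
The load step reads the `id`, `title`, `sourceurl`, `imageurl`, and `publishdate` fields from each entry and converts `publishdate` with `int()`. A small check before writing can catch entries that would fail partway through the load. This is a sketch only, and the helper name `find_invalid_articles` is hypothetical.

```python
REQUIRED_FIELDS = ("id", "title", "sourceurl", "imageurl", "publishdate")


def find_invalid_articles(articles):
    """Return (position, problems) pairs for entries that cannot be loaded."""
    invalid = []
    for position, article in enumerate(articles):
        problems = [field for field in REQUIRED_FIELDS if field not in article]
        # publishdate is converted with int() during the load, so check that too.
        if "publishdate" in article:
            try:
                int(article["publishdate"])
            except (TypeError, ValueError):
                problems.append("publishdate (not an integer)")
        if problems:
            invalid.append((position, problems))
    return invalid
```

Running `find_invalid_articles(sourceData['articles'])` returns an empty list when every entry has the fields the load step expects.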


#### 7 Load Articles into Table
Load the articles found in the JSON file into the table.

In [16]:
counter = 0
reportIncrement = len(sourceData['articles']) / 10
reportValue = reportIncrement

try:
       
    for article in sourceData['articles']:
    
        #- Get Image URL; not all articles have image
        sourceImageUrl = article['imageurl']

        if (sourceImageUrl == ""):
            sourceImageUrl = "NA"


        #- Load Record
        repoTable.put_item(Item={
            "ID" : article["id"],
            "title" : article['title'],
            "sourceurl" : article['sourceurl'],
            "imageurl" : sourceImageUrl,
            "publishdate" : int(article['publishdate'])
            })

        
        #- Update Counter
        counter +=1
        
        if (counter > reportValue):
            print(counter)
            reportValue = reportValue + reportIncrement
    

except Exception as e:
    print(e)
    print(f"Error loading data: {counter}")
else:
    #- Report success only when no exception interrupted the load
    print("-> Completed loading values")

627
1254
1881
2508
3135
3761
4388
5015
5642
-> Completed loading values
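
The cell above issues one `put_item` request per article. As an alternative sketch, the boto3 table resource provides `batch_writer()`, which buffers the puts and sends them in batches. The function name `load_articles` is hypothetical, and the field mapping mirrors the cell above.

```python
def load_articles(table, articles):
    """Queue every article through the table's batch writer; return the count queued."""
    queued = 0
    # batch_writer() buffers put_item calls and flushes them as batch requests,
    # including a final flush when the with-block exits.
    with table.batch_writer() as batch:
        for article in articles:
            batch.put_item(Item={
                "ID": article["id"],
                "title": article["title"],
                "sourceurl": article["sourceurl"],
                # Same "NA" placeholder as above for articles without an image.
                "imageurl": article["imageurl"] or "NA",
                "publishdate": int(article["publishdate"]),
            })
            queued += 1
    return queued
```

With the objects from this notebook, the call would be `load_articles(repoTable, sourceData['articles'])`.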
