This project contains two tasks that pull data from the Star Wars API and store it in a MySQL database using a dal (data access layer).
- Create a virtualenv: `virtualenv venv`
- Activate the virtualenv: `source venv/bin/activate`
- Install dependencies (using a virtual environment is recommended): `pip install -r requirements.txt`
- `mysql` database installation. Note: this project has been tested against mysql-5.6.47 using the OSX native package installation. More instructions here
Database setup instructions -
- To set up the database, use the SQL script `database_.sql` (contains DDL).
- You should have `settings/secrets.yaml` for database credentials:
# ---LOCAL---
LOCALSQL_USER: root
LOCALSQL_HOST: 127.0.0.1
LOCALSQL_PORT: 3306
LOCALSQL_PASSWORD: xxxxx
LOCALSQL_DATABASE: starwarsDB
- The installed `pymysql` version and the `mysql` version in use should be compatible with each other.
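As a rough sketch of how the credentials from `settings/secrets.yaml` might be handed to `pymysql`, the helper below maps the prefixed keys to connection keyword arguments. The function name `connection_kwargs` and its `prefix` parameter are hypothetical, and a plain dict stands in for the parsed YAML file.

```python
from typing import Any, Dict, Mapping


def connection_kwargs(secrets: Mapping[str, Any], prefix: str = "LOCALSQL") -> Dict[str, Any]:
    """Translate prefixed secrets keys into keyword arguments for a MySQL driver.

    Args:
        secrets: Parsed contents of settings/secrets.yaml.
        prefix: Key prefix selecting the environment block (hypothetical parameter).

    Returns:
        A dict with host, port, user, password and database entries.
    """
    return {
        "host": secrets[f"{prefix}_HOST"],
        "port": int(secrets[f"{prefix}_PORT"]),
        "user": secrets[f"{prefix}_USER"],
        "password": secrets[f"{prefix}_PASSWORD"],
        "database": secrets[f"{prefix}_DATABASE"],
    }


# Stand-in for the parsed YAML file (assumption: the project would load
# settings/secrets.yaml with a YAML parser instead of hard-coding a dict).
example_secrets = {
    "LOCALSQL_USER": "root",
    "LOCALSQL_HOST": "127.0.0.1",
    "LOCALSQL_PORT": 3306,
    "LOCALSQL_PASSWORD": "xxxxx",
    "LOCALSQL_DATABASE": "starwarsDB",
}
kwargs = connection_kwargs(example_secrets)
# With pymysql installed, these could be passed on as pymysql.connect(**kwargs).
```

Keeping the mapping in one place means a second environment block would only need a different `prefix`.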
- Activate the virtualenv: `source venv/bin/activate`
- Start the app with `python task_one.py`
Good luck!
[TODO] to be added soon.
The database model for the Star Wars API is as follows:
Names of models -
- characters (aka people)
- film
- species
- vehicle
- planets
- starships

The SQL script establishes `many-to-many` relationships between the `characters` and `film` tables and the other tables.
For example,
people (aka characters) <---> film :: many-to-many
people <---> species :: many-to-many
people <---> vehicles :: many-to-many
people <---> planets :: many-to-many
people <---> starships :: many-to-many
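To illustrate what one of the many-to-many relationships listed above could look like, here is a minimal sketch with a junction table. It uses an in-memory `sqlite3` database only so that it runs without a MySQL server; the junction table name `characters_film`, its columns, and the row IDs are assumptions and are not taken from the project's DDL.

```python
import sqlite3

# sqlite3 in-memory database, used only so the sketch runs without MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE characters (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE film (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Hypothetical junction table: one row per (character, film) pair.
    CREATE TABLE characters_film (
        character_id INTEGER NOT NULL REFERENCES characters (id),
        film_id INTEGER NOT NULL REFERENCES film (id),
        PRIMARY KEY (character_id, film_id)
    );
    """
)

# Illustrative rows; the IDs are made up for this example.
conn.execute("INSERT INTO characters VALUES (1, 'Anakin Skywalker')")
conn.executemany(
    "INSERT INTO film VALUES (?, ?)",
    [(1, "The Phantom Menace"), (2, "Attack of the Clones"), (3, "Revenge of the Sith")],
)
conn.executemany("INSERT INTO characters_film VALUES (1, ?)", [(1,), (2,), (3,)])

# Resolve the relationship: all films for one character, via the junction table.
rows = conn.execute(
    """
    SELECT film.title
    FROM film
    JOIN characters_film ON characters_film.film_id = film.id
    WHERE characters_film.character_id = ?
    ORDER BY film.title
    """,
    (1,),
).fetchall()
films = [title for (title,) in rows]
```

The composite primary key keeps each (character, film) pair unique, so re-running an import cannot silently duplicate a relationship.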
[TODO] More description to be added here.
Example,
```
# ---LOCAL---
LOCALSQL_USER: root
LOCALSQL_HOST: 127.0.0.1
LOCALSQL_PORT: 3306
LOCALSQL_PASSWORD: xxxx
LOCALSQL_DATABASE: starwarsDB
```
NOTE: Readers are requested to raise PRs for any problems identified with the script.
- Since the code is written in Python 3.7, function annotations and type hints are used throughout.
- Google-style docstrings are used to describe functions, classes, and modules.
- Pydantic data classes are used to validate the responses from the Star Wars API endpoints.
- Set your IDE's line length limit to a maximum of 100 characters (recommended).
- Set your configuration via `settings/secrets.yaml`. Do NOT commit any file containing secrets.
- Generic functionality is maintained under `commons`.
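The conventions above (type hints, Google-style docstrings, validated response models) might look roughly like the following. To stay dependency-free, this sketch uses a standard-library dataclass with a manual type check as a stand-in for Pydantic; the class name `CharacterSummary` is hypothetical and the project's actual models may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterSummary:
    """Subset of a character response, as seen in the Task 1 output.

    Attributes:
        name: Character name.
        homeworld: URL of the character's home planet.
        gender: Gender string as returned by the API.
    """

    name: str
    homeworld: str
    gender: str

    def __post_init__(self) -> None:
        """Reject non-string field values.

        Raises:
            TypeError: If any field is not a string.
        """
        for field_name in ("name", "homeworld", "gender"):
            value = getattr(self, field_name)
            if not isinstance(value, str):
                raise TypeError(f"{field_name} must be a str, got {type(value).__name__}")


summary = CharacterSummary(
    name="Anakin Skywalker",
    homeworld="https://swapi.co/api/planets/1/",
    gender="male",
)
```

A Pydantic model would perform this kind of check (and more, such as coercion) declaratively; the manual `__post_init__` only mimics the basic idea.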
The Star Wars API lists 87 main characters in the Star Wars saga. For the first task, we would
like you to use a random number generator that picks a number between 1-87. Using these
random numbers you will be pulling 15 characters from the API using Python.
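A minimal sketch of the random selection step, assuming the 15 IDs should be distinct (the task text does not say so explicitly). The function name `pick_character_ids` and the `people` URL pattern are assumptions; the pattern is modelled on the planet URLs that appear in the Task 1 output.

```python
import random
from typing import List, Optional

TOTAL_CHARACTERS = 87
SAMPLE_SIZE = 15
# Assumed URL pattern for a single character.
PEOPLE_URL = "https://swapi.co/api/people/{id}/"


def pick_character_ids(
    count: int = SAMPLE_SIZE,
    upper: int = TOTAL_CHARACTERS,
    seed: Optional[int] = None,
) -> List[int]:
    """Pick distinct random character IDs between 1 and upper, inclusive.

    Args:
        count: Number of IDs to pick.
        upper: Largest valid ID.
        seed: Optional seed, useful for reproducible runs.

    Returns:
        A list of count distinct integers.
    """
    rng = random.Random(seed)
    # range(1, upper + 1) includes upper; sample() draws without replacement.
    return rng.sample(range(1, upper + 1), count)


ids = pick_character_ids(seed=42)
urls = [PEOPLE_URL.format(id=character_id) for character_id in ids]
```

Sampling without replacement avoids requesting the same character twice; if repeats were acceptable, `rng.randint(1, upper)` in a loop would be the alternative.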
OUTPUT OF TASK 1 (as of timestamp: '2019-10-20 18:31:20')
[
  {
    "film": "Attack of the Clones",
    "characters": [
      {
        "name": "Anakin Skywalker",
        "homeworld": "https://swapi.co/api/planets/1/",
        "gender": "male"
      }
    ]
  },
  {
    "film": "Revenge of the Sith",
    "characters": [
      {
        "name": "Anakin Skywalker",
        "homeworld": "https://swapi.co/api/planets/1/",
        "gender": "male"
      }
    ]
  },
  {
    "film": "The Phantom Menace",
    "characters": [
      {
        "name": "Anakin Skywalker",
        "homeworld": "https://swapi.co/api/planets/1/",
        "gender": "male"
      }
    ]
  }
]
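The output above groups characters under each film they appear in. One way such a structure could be produced is sketched below; it assumes each character record already carries a `films` list of resolved film titles, which is an assumption about an intermediate step and not necessarily how `task_one.py` does it. Sorting by title is likewise a guess based on the ordering in the sample.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

SUMMARY_KEYS = ("name", "homeworld", "gender")


def group_by_film(characters: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Group character summaries under the titles of the films they appear in.

    Args:
        characters: Character records with name, homeworld, gender and a
            films list of titles (assumed to be resolved beforehand).

    Returns:
        A list of {"film": ..., "characters": [...]} dicts sorted by title.
    """
    grouped = defaultdict(list)
    for character in characters:
        summary = {key: character[key] for key in SUMMARY_KEYS}
        for title in character["films"]:
            grouped[title].append(summary)
    return [{"film": title, "characters": grouped[title]} for title in sorted(grouped)]


anakin = {
    "name": "Anakin Skywalker",
    "homeworld": "https://swapi.co/api/planets/1/",
    "gender": "male",
    "films": ["The Phantom Menace", "Attack of the Clones", "Revenge of the Sith"],
}
result = group_by_film([anakin])
```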
Task 2 goes as follows:
1. Pull data for the movie A New Hope
2. Replace the data for each of the endpoints listed in the JSON object you
receive from the API request (e.g. - In the example above you would take
all the character endpoints and pull the data from each of those
endpoints then insert the data into the JSON object, etc.)
a. A New Hope has character, planet, starship, vehicle, and species
data you will need to retrieve and replace.
3. We also ask that you convert the metric heights and weights of each
character to standard units.
4. You will also need to remove all cross-referencing material from the
data you replace (e.g. - When you pull Luke Skywalker you would want
to remove cross-referencing URLs from Luke’s JSON object, like films,
species, vehicles, and spaceships.)
5. Lastly, you will take the dictionary you created and write it out to
a JSON file locally named task_two.json.
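Steps 3 to 5 above could be sketched as follows. This assumes "standard units" means inches and pounds, and that the cross-referencing fields on a character are named `films`, `species`, `vehicles` and `starships`; the helper names are hypothetical, and the sample height and mass are illustrative values only.

```python
import json
from typing import Any, Dict, Optional

CM_PER_INCH = 2.54
LB_PER_KG = 2.20462
# Assumed names of the cross-referencing fields on a character record.
CROSS_REFERENCE_KEYS = ("films", "species", "vehicles", "starships")


def _to_float(value: Any) -> Optional[float]:
    """Parse values such as '172' or '1,358'; return None for e.g. 'unknown'."""
    try:
        return float(str(value).replace(",", ""))
    except ValueError:
        return None


def clean_character(character: Dict[str, Any]) -> Dict[str, Any]:
    """Drop cross-reference lists and convert height and mass to inches and pounds."""
    cleaned = {k: v for k, v in character.items() if k not in CROSS_REFERENCE_KEYS}
    height_cm = _to_float(cleaned.get("height"))
    mass_kg = _to_float(cleaned.get("mass"))
    if height_cm is not None:
        cleaned["height"] = round(height_cm / CM_PER_INCH, 2)  # inches
    if mass_kg is not None:
        cleaned["mass"] = round(mass_kg * LB_PER_KG, 2)  # pounds
    return cleaned


def write_json(data: Dict[str, Any], path: str = "task_two.json") -> None:
    """Write the assembled dictionary to a local JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)


luke = {
    "name": "Luke Skywalker",
    "height": "172",
    "mass": "77",
    "films": ["https://swapi.co/api/films/1/"],
    "species": [],
    "vehicles": [],
    "starships": [],
}
cleaned = clean_character(luke)
```

Values that cannot be parsed as numbers (for example `unknown`) are left untouched instead of being guessed.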
The random number generator may produce some integer IDs within the range 1-87 that do not yield
any results from the Star Wars API (404s). In that case, we skip those IDs and store the rest (fair enough?)
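A minimal sketch of the skip-on-404 behaviour, using only the standard library. The `people` URL pattern and both function names are assumptions; the fetcher is passed in as a parameter so the skipping logic can be exercised without network access.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Callable, Dict, Iterable, List, Optional

# Assumed URL pattern for a single character.
PEOPLE_URL = "https://swapi.co/api/people/{id}/"

Fetcher = Callable[[int], Optional[Dict[str, Any]]]


def fetch_character(character_id: int) -> Optional[Dict[str, Any]]:
    """Return the character JSON, or None when the API answers 404."""
    try:
        with urllib.request.urlopen(PEOPLE_URL.format(id=character_id)) as response:
            return json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise  # Other HTTP errors are not silently ignored.


def fetch_existing(ids: Iterable[int], fetch: Fetcher = fetch_character) -> List[Dict[str, Any]]:
    """Fetch every ID and keep only those that returned a record."""
    found = []
    for character_id in ids:
        record = fetch(character_id)
        if record is None:
            continue  # Skip IDs that do not exist.
        found.append(record)
    return found


# Offline usage example with a fake fetcher: ID 2 plays the role of a 404.
fake_api = {1: {"name": "A"}, 3: {"name": "C"}}
existing = fetch_existing([1, 2, 3], fetch=fake_api.get)
```

Note that skipping means fewer than 15 characters may end up stored for a given run.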
**[TODO] Try another approach:**
Crawl through all the URLs of the Star Wars API first, resolve dependencies endpoint by endpoint,
and store the data in record tables and relationship tables.
Finally, use the local database to produce the results asked for in each task.
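The crawl-first idea could be sketched roughly as below. It assumes list endpoints are paginated with `results` and `next` keys and that each record carries its own `url`; the function names, the injected `get_json` callable, and the fake page data are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple


def crawl(start_url: str, get_json: Callable[[str], Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Follow 'next' links from start_url and collect every record."""
    records: List[Dict[str, Any]] = []
    url = start_url
    while url:
        page = get_json(url)
        records.extend(page["results"])
        url = page.get("next")  # None on the last page ends the loop.
    return records


def relationship_pairs(records: List[Dict[str, Any]], field: str) -> List[Tuple[str, str]]:
    """Flatten a cross-reference field into (record_url, related_url) pairs.

    Each pair corresponds to one row of a relationship table.
    """
    return [
        (record["url"], related)
        for record in records
        for related in record.get(field, [])
    ]


# Fake two-page endpoint so the sketch runs offline.
fake_pages = {
    "people?page=1": {
        "next": "people?page=2",
        "results": [{"url": "people/1", "films": ["films/1"]}],
    },
    "people?page=2": {
        "next": None,
        "results": [{"url": "people/2", "films": ["films/1", "films/2"]}],
    },
}
people = crawl("people?page=1", fake_pages.__getitem__)
pairs = relationship_pairs(people, "films")
```

Records would go into the record tables and `pairs` into the matching relationship table, after which the tasks could be answered from the local database alone.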