# What is a Database?
A `database` is a collection of data (information) stored electronically in a structured manner. The way the data is organized determines the type of database.

## Types of Database

### Hierarchical Databases
Just as in any hierarchy, this database categorizes data in ranks or levels, and the ranks are expressed using links. It is organized in a parent-child relationship, and the entire database would resemble a tree.

</br></br><img src="https://i.imgur.com/23nTU9D.png" width="1000"></br></br>

#### Components of a Hierarchical Database
A `hierarchical` database consists of a collection of records connected to each other through links. Each record is a collection of fields or attributes, each of which contains only one data value. A child record can be linked to only one parent record.

`record`[`fields/attributes`] <-- (`link`) --> `record`[`fields/attributes`]
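
As a rough illustration that is not tied to any particular database product, the record/link structure can be sketched in plain Python. The `Record` class, the `path_to_root` helper, and the sample data are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """A record holds fields and is linked to at most one parent."""
    fields: dict
    parent: Optional["Record"] = None
    children: list = field(default_factory=list)

    def add_child(self, child: "Record") -> "Record":
        # Enforce the hierarchical rule: one parent per child record.
        if child.parent is not None:
            raise ValueError("a child record can have only one parent")
        child.parent = self
        self.children.append(child)
        return child


def path_to_root(record: Record) -> list:
    """Follow parent links upward and collect the names along the way."""
    names = []
    while record is not None:
        names.append(record.fields["name"])
        record = record.parent
    return names


company = Record({"name": "Company"})
sales = company.add_child(Record({"name": "Sales"}))
alice = sales.add_child(Record({"name": "Alice"}))

print(path_to_root(alice))  # ['Alice', 'Sales', 'Company']
```

Because every record has a single parent, there is exactly one path from any record back to the root, which is what gives the database its tree shape.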

### Network Databases
A `network` database resembles a hierarchical database, except that a child record can be associated with multiple parent records.
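
A minimal sketch of that difference, assuming records are identified by simple string names: links are kept in two many-valued mappings, so a child can sit under several parents. The `link` helper and the sample names are hypothetical.

```python
from collections import defaultdict

# parent -> children and child -> parents; both sides may hold many values.
children_of = defaultdict(set)
parents_of = defaultdict(set)


def link(parent: str, child: str) -> None:
    """Record a parent-child link in both directions."""
    children_of[parent].add(child)
    parents_of[child].add(parent)


# The same child record is linked to two different parents.
link("Sales", "Alice")
link("Project Apollo", "Alice")

print(sorted(parents_of["Alice"]))  # ['Project Apollo', 'Sales']
```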

</br></br><img src="https://i.imgur.com/MhP6vhP.png" width="1000"></br></br>

### Relational Databases
`Relational` databases are the most widely used type of database and among the most mature. Records are stored in tables, and a record in one table can be linked to records in other tables through shared key values.

#### Components of a Relational Database
- Attributes
- Entities
- Relationships
- Cardinality
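
One way to see how these components fit together is a small sketch using Python's built-in `sqlite3` module. The `venue` and `event` tables and their rows are hypothetical: each table stands for an entity, its columns are the attributes, the `venue_id` foreign key expresses the relationship, and the cardinality is one-to-many (one venue hosts many events).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

# Entity "venue" with two attributes.
conn.execute("""
    CREATE TABLE venue (
        venue_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    )
""")
# Entity "event"; the venue_id column is the relationship to "venue".
conn.execute("""
    CREATE TABLE event (
        event_id INTEGER PRIMARY KEY,
        title    TEXT NOT NULL,
        venue_id INTEGER NOT NULL REFERENCES venue(venue_id)
    )
""")

conn.execute("INSERT INTO venue VALUES (1, 'Example Hall')")
conn.executemany(
    "INSERT INTO event (title, venue_id) VALUES (?, ?)",
    [("Friday Quartet", 1), ("Sunday Trio", 1)],
)

# Cardinality in practice: one venue row, two related event rows.
event_count = conn.execute(
    "SELECT COUNT(*) FROM event WHERE venue_id = 1"
).fetchone()[0]
print(event_count)  # 2
```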

### NoSQL Databases
A `NoSQL` database is a non-relational database that provides a mechanism for storing and retrieving data without the tabular model of relational databases. Its typical advantages include simplicity of design, simpler horizontal scaling to clusters of machines, and finer control over availability. The data structures used by NoSQL databases (for example key-value pairs, documents, wide columns, or graphs) are very different from tables in relational databases, which makes some operations faster in NoSQL.
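
To make the contrast with tables concrete, here is a toy document store built on a plain Python dictionary. It is only a sketch of the idea, not how any real NoSQL product is implemented, and the keys and documents are made up for illustration.

```python
import json

# A toy key-value/document store: each value is a free-form document.
store = {}

store["venue:1"] = {
    "name": "Example Hall",
    "categories": ["Music Venue"],
    "location": {"locality": "Toronto", "region": "ON"},
}
# A second document with a different shape; no schema change is needed.
store["venue:2"] = {"name": "Example Tavern", "open_late": True}

# Retrieval is a single lookup by key rather than a join across tables.
doc = store["venue:1"]
print(doc["location"]["locality"])  # Toronto
print(json.dumps(store["venue:2"]))
```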

## Database Management Systems
A `Database Management System` (DBMS) is a computerized data-keeping system. Users of the system are given facilities to perform several kinds of operations, either to manipulate the data or to manage the database structure itself. DBMSs are categorized according to their database structure or type: hierarchical, network, and relational.

# Relational Databases
A database stores data in an organized way so that it can be searched and retrieved later. A relational database is made up of one or more tables. Within a table, every row has the same columns, and each column holds one kind of data. Data can be inserted (created), retrieved (read), updated, and deleted from a table; these four operations are abbreviated as CRUD.
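
The four CRUD operations can be sketched with Python's built-in `sqlite3` module and an in-memory database. The `venue` table and its columns are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE venue (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Create: insert a new row.
conn.execute(
    "INSERT INTO venue (name, city) VALUES (?, ?)", ("Jazz Bistro", "Toronto")
)

# Read: retrieve the row that was just inserted.
row = conn.execute(
    "SELECT id, name, city FROM venue WHERE name = ?", ("Jazz Bistro",)
).fetchone()
print(row)  # (1, 'Jazz Bistro', 'Toronto')

# Update: change one column of that row.
conn.execute("UPDATE venue SET city = ? WHERE id = ?", ("Toronto, ON", row[0]))
updated = conn.execute(
    "SELECT city FROM venue WHERE id = ?", (row[0],)
).fetchone()[0]

# Delete: remove the row again.
conn.execute("DELETE FROM venue WHERE id = ?", (row[0],))
remaining = conn.execute("SELECT COUNT(*) FROM venue").fetchone()[0]
print(updated, remaining)  # Toronto, ON 0
```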

A relational database is a type of database that organizes data into tables and creates links between these tables based on defined relationships. These relationships enable the user to retrieve and combine data from one or more tables with a single query.
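
A short sketch of combining two tables with a single query, again using `sqlite3`. The table layout is an assumption for illustration; the venue names and category IDs are sample values of the kind the Foursquare API returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE venue (
        id          INTEGER PRIMARY KEY,
        name        TEXT,
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category VALUES (10039, 'Music Venue'), (10037, 'Concert Hall');
    INSERT INTO venue VALUES
        (1, 'The Dakota Tavern', 10039),
        (2, 'The Music Hall', 10037);
""")

# One query pulls columns from both tables by following the relationship.
rows = conn.execute("""
    SELECT venue.name, category.name
    FROM venue
    JOIN category ON venue.category_id = category.id
    ORDER BY venue.name
""").fetchall()
print(rows)
# [('The Dakota Tavern', 'Music Venue'), ('The Music Hall', 'Concert Hall')]
```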

Below, we'll use the Foursquare API to build a database of bars and venues that accommodate live bands in Toronto and the GTA.

In [13]:
from dotenv import load_dotenv

import requests
import os
import pandas as pd

In [14]:
load_dotenv(override=True)
FSQ_API_KEY = os.environ["FOURSQUARE_API_KEY"]

To understand how to search the FSQ API for certain types of venues, we must first understand the taxonomy of their categories. FSQ Places includes a hierarchical taxonomy of categories by which each point-of-interest (POI) record is classified. The 10 parent categories are:
- 10xxx Arts and Entertainment
- 11xxx Business and Professional Services
- 12xxx Community and Government
- 13xxx Dining and Drinking
- 14xxx Event
- 15xxx Health and Medicine
- 16xxx Landmarks and Outdoors
- 17xxx Retail
- 18xxx Sports and Recreation
- 19xxx Travel and Transportation

Each parent category is identified numerically, from `10xxx` to `19xxx`, in alphabetical order. Each subcategory is then numbered using the 3 remaining digits, `xxx`. Because we're looking for venues for live bands, we'll trim our options down to the following parent categories:
- 10xxx Arts and Entertainment
- 13xxx Dining and Drinking
- 14xxx Event
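
Since the first two digits of a five-digit category ID identify its parent, checking whether an ID falls under one of these three parents takes a single integer division. The `PARENT_CATEGORIES` mapping and `parent_category` helper below are hypothetical names for this sketch.

```python
# The three parent categories of interest, keyed by their two-digit prefix.
PARENT_CATEGORIES = {
    10: "Arts and Entertainment",
    13: "Dining and Drinking",
    14: "Event",
}


def parent_category(category_id: int):
    """Return the parent label for a 5-digit category ID, or None if out of scope."""
    return PARENT_CATEGORIES.get(category_id // 1000)


print(parent_category(10039))  # Arts and Entertainment
print(parent_category(17098))  # None
```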

Use pandas to import the category list and convert it to a DataFrame, so that it can be reshaped into a usable format.

In [15]:
df = pd.read_csv("./_data/places-and-apiv3-categories.csv", index_col=0)

In [16]:
df_split = df["Category Label"].str.split(' > ', expand=True)
df_split.head(5)

Unnamed: 0,0,1,2,3,4,5
10000,Arts and Entertainment,,,,,
10001,Arts and Entertainment,Amusement Park,,,,
10002,Arts and Entertainment,Aquarium,,,,
10003,Arts and Entertainment,Arcade,,,,
10004,Arts and Entertainment,Art Gallery,,,,


In [17]:
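# Subcategory label to look up in the second level of the taxonomy.
# It is left empty here, so no rows match.
query = ""
# Note: str.capitalize() lowercases everything after the first letter,
# so multi-word labels such as "Art Gallery" will not match this way.
results = df_split[df_split[1] == query.capitalize()]
results.info()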

<class 'pandas.core.frame.DataFrame'>
Index: 0 entries
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   0       0 non-null      object
 1   1       0 non-null      object
 2   2       0 non-null      object
 3   3       0 non-null      object
 4   4       0 non-null      object
 5   5       0 non-null      object
dtypes: object(6)
memory usage: 0.0+ bytes
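
An exact comparison against `query.capitalize()` is fairly brittle, because `capitalize()` turns `"art gallery"` into `"Art gallery"`, which differs from the label `"Art Gallery"`. One possible alternative is a case-insensitive substring match. The sketch below runs on a small stand-in for `df_split` built from rows like those shown earlier; `find_categories` is a hypothetical helper.

```python
import pandas as pd

# Small stand-in for df_split, using a few of the rows shown earlier.
sample = pd.DataFrame(
    {
        0: ["Arts and Entertainment"] * 3,
        1: [None, "Amusement Park", "Art Gallery"],
    },
    index=[10000, 10001, 10004],
)


def find_categories(frame, query, level=1):
    """Case-insensitive substring match on one level of the taxonomy."""
    # na=False treats missing labels as non-matches instead of raising.
    mask = frame[level].str.contains(query, case=False, na=False, regex=False)
    return frame[mask]


matches = find_categories(sample, "art gallery")
print(matches.index.tolist())  # [10004]
```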


In [18]:
url = "https://api.foursquare.com/v3/places/search"
headers = {
  "accept" : "application/json",
  "Authorization" : FSQ_API_KEY
}

params = {
  "query" : "music",
  "near" : "Toronto"
}

response = requests.get(url, params=params, headers=headers)
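
A free-text query such as `"music"` also returns music schools and stores. One way to narrow the search might be to pass category IDs along with the query. The sketch below only builds the parameters and makes no network call. It assumes the search endpoint accepts a comma-separated `categories` parameter and a `limit` parameter, which should be checked against the current Foursquare documentation. `build_search_params` is a hypothetical helper.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.foursquare.com/v3/places/search"


def build_search_params(query, near, category_ids=(), limit=10):
    """Assemble query parameters for a place search."""
    params = {"query": query, "near": near, "limit": limit}
    if category_ids:
        # Assumed format: category IDs joined by commas.
        params["categories"] = ",".join(str(c) for c in category_ids)
    return params


# 10039 = Music Venue, 10040 = Jazz and Blues Venue
params = build_search_params("music", "Toronto", category_ids=[10039, 10040])
print(f"{SEARCH_URL}?{urlencode(params)}")
```

The resulting dictionary can be passed as `params=` to `requests.get` in place of the hand-written one.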

In [19]:
response.json()

{'results': [{'fsq_id': '4b31677bf964a520530625e3',
   'categories': [{'id': 12054,
     'name': 'Music School',
     'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/building/school_music_',
      'suffix': '.png'}}],
   'chains': [],
   'distance': 3890,
   'geocodes': {'main': {'latitude': 43.668027, 'longitude': -79.396069},
    'roof': {'latitude': 43.668027, 'longitude': -79.396069}},
   'link': '/v3/places/4b31677bf964a520530625e3',
   'location': {'address': '273 Bloor St W',
    'country': 'CA',
    'formatted_address': '273 Bloor St W, Toronto ON M5S 1W2',
    'locality': 'Toronto',
    'postcode': 'M5S 1W2',
    'region': 'ON'},
   'name': 'Royal Conservatory of Music',
   'related_places': {'children': [{'fsq_id': '4adf787cf964a520e57a21e3',
      'name': 'Koerner Hall'},
     {'fsq_id': '4ba79edaf964a52020a339e3', 'name': 'B Expresso Bar'}]},
   'timezone': 'America/Toronto'},
  {'fsq_id': '4ad4c062f964a520c1f720e3',
   'categories': [{'id': 10039,
     'name': '

In [20]:
df = pd.DataFrame(response.json()['results'])

In [21]:
df

Unnamed: 0,fsq_id,categories,chains,distance,geocodes,link,location,name,related_places,timezone
0,4b31677bf964a520530625e3,"[{'id': 12054, 'name': 'Music School', 'icon':...",[],3890,"{'main': {'latitude': 43.668027, 'longitude': ...",/v3/places/4b31677bf964a520530625e3,"{'address': '273 Bloor St W', 'country': 'CA',...",Royal Conservatory of Music,{'children': [{'fsq_id': '4adf787cf964a520e57a...,America/Toronto
1,4ad4c062f964a520c1f720e3,"[{'id': 10039, 'name': 'Music Venue', 'icon': ...",[],7863,"{'main': {'latitude': 43.629264, 'longitude': ...",/v3/places/4ad4c062f964a520c1f720e3,"{'address': '909 Lake Shore Blvd W', 'country'...",Budweiser Stage,{'parent': {'fsq_id': '4c10206ae7d295216fbcd7b...,America/Toronto
2,4af5d9a9f964a520cafd21e3,"[{'id': 17098, 'name': 'Music Store', 'icon': ...",[],4377,"{'main': {'latitude': 43.661443, 'longitude': ...",/v3/places/4af5d9a9f964a520cafd21e3,"{'address': '925 Bloor St W', 'country': 'CA',...",Long and McQuade,{},America/Toronto
3,4ad4c062f964a520f2f720e3,"[{'id': 10037, 'name': 'Concert Hall', 'icon':...",[],5443,"{'main': {'latitude': 43.676479, 'longitude': ...",/v3/places/4ad4c062f964a520f2f720e3,"{'address': '147 Danforth Ave', 'country': 'CA...",The Music Hall,{},America/Toronto
4,4de94b3cd4c0faa56445dbee,"[{'id': 10037, 'name': 'Concert Hall', 'icon':...",[],7859,"{'main': {'latitude': 43.629147, 'longitude': ...",/v3/places/4de94b3cd4c0faa56445dbee,"{'address': '909 Lake Shore Blvd W', 'country'...",TD Echo Beach,{'children': []},America/Toronto
5,514cc159e4b0e4f73af4eced,"[{'id': 10040, 'name': 'Jazz and Blues Venue',...",[],5764,"{'main': {'latitude': 43.655781, 'longitude': ...",/v3/places/514cc159e4b0e4f73af4eced,"{'address': '251 Victoria St', 'country': 'CA'...",Jazz Bistro,{},America/Toronto
6,4baa0f32f964a5205b473ae3,"[{'id': 16017, 'name': 'Garden', 'icon': {'pre...",[],7250,"{'main': {'latitude': 43.637096, 'longitude': ...",/v3/places/4baa0f32f964a5205b473ae3,"{'address': '475 Queens Quay W', 'country': 'C...",Toronto Music Garden,{},America/Toronto
7,4ad4c05df964a52067f620e3,"[{'id': 10039, 'name': 'Music Venue', 'icon': ...",[],5615,"{'drop_off': {'latitude': 43.649612, 'longitud...",/v3/places/4ad4c05df964a52067f620e3,"{'address': '249 Ossington Ave', 'country': 'C...",The Dakota Tavern,{},America/Toronto
8,4b522e7df964a520ec6d27e3,"[{'id': 17098, 'name': 'Music Store', 'icon': ...",[],10822,"{'main': {'latitude': 43.781526, 'longitude': ...",/v3/places/4b522e7df964a520ec6d27e3,"{'address': '2777 Steeles Ave W', 'address_ext...",Long and McQuade,{},America/Toronto
9,551128d4498e2e50d8f03c5b,"[{'id': 11167, 'name': 'Technology Business', ...",[],7240,"{'main': {'latitude': 43.650115, 'longitude': ...",/v3/places/551128d4498e2e50d8f03c5b,"{'address': '179 John St', 'country': 'CA', 'c...",Spotify,{},America/Toronto


In [22]:
df_categories = pd.DataFrame(category[0] for category in df['categories'])
df_categories

Unnamed: 0,id,name,icon
0,12054,Music School,{'prefix': 'https://ss3.4sqi.net/img/categorie...
1,10039,Music Venue,{'prefix': 'https://ss3.4sqi.net/img/categorie...
2,17098,Music Store,{'prefix': 'https://ss3.4sqi.net/img/categorie...
3,10037,Concert Hall,{'prefix': 'https://ss3.4sqi.net/img/categorie...
4,10037,Concert Hall,{'prefix': 'https://ss3.4sqi.net/img/categorie...
5,10040,Jazz and Blues Venue,{'prefix': 'https://ss3.4sqi.net/img/categorie...
6,16017,Garden,{'prefix': 'https://ss3.4sqi.net/img/categorie...
7,10039,Music Venue,{'prefix': 'https://ss3.4sqi.net/img/categorie...
8,17098,Music Store,{'prefix': 'https://ss3.4sqi.net/img/categorie...
9,11167,Technology Business,{'prefix': 'https://ss3.4sqi.net/img/categorie...


In [23]:
df_categories['icon'][0]

{'prefix': 'https://ss3.4sqi.net/img/categories_v2/building/school_music_',
 'suffix': '.png'}
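
Columns such as `categories` and `location` still hold nested lists and dictionaries, which do not map cleanly onto relational tables. One option is `pd.json_normalize`, sketched here on two trimmed records shaped like the API results. The variable names are assumptions for this example.

```python
import pandas as pd

# Two trimmed records shaped like the API results.
results = [
    {
        "fsq_id": "4b31677bf964a520530625e3",
        "name": "Royal Conservatory of Music",
        "categories": [{"id": 12054, "name": "Music School"}],
        "location": {"locality": "Toronto", "postcode": "M5S 1W2"},
    },
    {
        "fsq_id": "514cc159e4b0e4f73af4eced",
        "name": "Jazz Bistro",
        "categories": [{"id": 10040, "name": "Jazz and Blues Venue"}],
        "location": {"locality": "Toronto"},
    },
]

# Nested dicts become dotted columns such as "location.locality".
venues = pd.json_normalize(results)

# One row per (venue, category) pair, carrying the venue's fsq_id along.
venue_categories = pd.json_normalize(
    results, record_path="categories", meta=["fsq_id"], record_prefix="category_"
)

print(venues[["fsq_id", "name", "location.locality"]])
print(venue_categories)
```

Splitting the data this way yields one flat frame per entity, which is closer to the table layout a relational database expects.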

## Entity Relationship Diagrams

A common practice for visualizing relational data is through **Entity Relationship Diagrams** (ERD).

An ERD can be used to understand, explore, and document a database. It can also help troubleshoot logic or deployment problems, spot inefficiencies, and improve processes. For designing and modeling new databases, ERDs are indispensable.

### Best Practices
- `Identify the Entities`: Identify all of the entities. An entity is a description of a particular object that the system stores information about. Keep each entity as self-contained as possible.
- `Identify the Relationship`: Consider how each pair of entities is related. Draw a solid line between them and label it with a brief description of the relationship.
- `Add the Attributes`: Add the key attributes of each entity using oval symbols.
- `Complete the Diagram`: Continue connecting the entities with lines and adding diamonds to describe each relationship until all relationships have been described. Use crow's foot notation to mark the cardinality of each relationship.
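
Once a diagram exists, it translates fairly directly into table definitions. As a sketch, assume an ERD with two entities, `venue` and `category`, joined by a many-to-many relationship (a venue can have several categories, and a category applies to many venues). The table and column names below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Entity: venue, with its key attributes as columns.
    CREATE TABLE venue (
        fsq_id TEXT PRIMARY KEY,
        name   TEXT NOT NULL
    );
    -- Entity: category.
    CREATE TABLE category (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Junction table: resolves the many-to-many relationship.
    CREATE TABLE venue_category (
        fsq_id      TEXT    NOT NULL REFERENCES venue(fsq_id),
        category_id INTEGER NOT NULL REFERENCES category(id),
        PRIMARY KEY (fsq_id, category_id)
    );
""")

# List the tables that were created from the diagram.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)  # ['category', 'venue', 'venue_category']
```

The junction table is the usual way to express a many-to-many cardinality, since a single foreign key column can only point at one row.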