# Catchall

This notebook creates the Nutrition and Magazines tables in the staging area.

## A. Adding Nutrition to staging
- This table was not used in any of the previous projects, so we use this catchall notebook to add it to staging.

Select five rows to view the columns and data types.

In [None]:
%%bigquery
select * from magazine_recipes_raw.nutrition
limit 5

| | recipe_id | protien | carbo | alcohol | total_fat | sat_fat | cholestrl | sodium | iron | vitamin_c | vitamin_a | fiber | pcnt_cal_carb | pcnt_cal_fat | pcnt_cal_prot | calories | load_time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 559 | 8.1 | 0.78 | 0.0 | 2.85 | 1.07 | 23.86 | 58.48 | 1.03 | 0.92 | 0.32 | 0.0 | 5.11 | 41.98 | 52.91 | 61.2 | 2024-01-27 00:11:11.060078+00:00 |
| 1 | 838 | 4.29 | 20.0 | 0.0 | 2.13 | 0.6 | 0.0 | 136.5 | 0.0 | 0.0 | 0.0 | 0.0 | 68.77 | 16.48 | 14.75 | 116.33 | 2024-01-27 00:11:11.060078+00:00 |
| 2 | 858 | 0.31 | 7.56 | 0.0 | 1.49 | 1.29 | 0.0 | 69.42 | 0.1 | 0.0 | 0.0 | 0.0 | 67.31 | 29.93 | 2.77 | 44.95 | 2024-01-27 00:11:11.060078+00:00 |
| 3 | 873 | 8.08 | 11.71 | 0.0 | 2.15 | 1.34 | 8.54 | 255.96 | 0.12 | 2.39 | 78.08 | 0.0 | 47.57 | 19.62 | 32.8 | 98.48 | 2024-01-27 00:11:11.060078+00:00 |
| 4 | 874 | 8.03 | 11.37 | 0.0 | 8.15 | 5.07 | 33.16 | 119.56 | 0.12 | 2.32 | 309.88 | 0.0 | 30.13 | 48.59 | 21.27 | 150.94 | 2024-01-27 00:11:11.060078+00:00 |


We use except() to pull load_time out of the column list so that it can be placed last, after the new data_source column. This completes the Nutrition table in staging.

In [None]:
%%bigquery
CREATE OR REPLACE TABLE magazine_recipes_stg.Nutrition AS
select * except(load_time), 'bird' as data_source, load_time from magazine_recipes_raw.nutrition
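
The effect of except() on the column order can be sketched in plain Python. This is only an illustration and not part of the pipeline: the helper name reorder_with_source is hypothetical, and the column list is copied from the preview output above.

```python
def reorder_with_source(columns, last="load_time", new_column="data_source"):
    """Mimic `select * except(load_time), ... as data_source, load_time`.

    Keeps every column except `last` in its original order, then appends
    the new column followed by `last`.
    """
    kept = [c for c in columns if c != last]
    return kept + [new_column, last]


# Column names as they appear in magazine_recipes_raw.nutrition.
raw_columns = [
    "recipe_id", "protien", "carbo", "alcohol", "total_fat", "sat_fat",
    "cholestrl", "sodium", "iron", "vitamin_c", "vitamin_a", "fiber",
    "pcnt_cal_carb", "pcnt_cal_fat", "pcnt_cal_prot", "calories", "load_time",
]

staged_columns = reorder_with_source(raw_columns)
print(staged_columns[-2:])  # ['data_source', 'load_time']
```

The staged table therefore has one more column than the raw table, with load_time still in the final position.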

## B. Creating Magazine Table
- This is just a placeholder table. Only the key column, magazine_id, is filled in; the rest of the columns will be populated using AI.

In [None]:
from google.cloud import bigquery

client = bigquery.Client()

table_name = 'Magazines'

schema = [
  bigquery.SchemaField("magazine_id", "INTEGER", mode="REQUIRED"),
  bigquery.SchemaField("magazine_name", "STRING", mode="NULLABLE"),
  bigquery.SchemaField("website", "STRING", mode="NULLABLE"),
  bigquery.SchemaField("pub_frequency_weeks", "INTEGER", mode="NULLABLE"),
  bigquery.SchemaField("publishing_company", "STRING", mode="NULLABLE"),
  bigquery.SchemaField("subscription_price", "INTEGER", mode="NULLABLE"),
  bigquery.SchemaField("data_source", "STRING", mode="NULLABLE"),
  bigquery.SchemaField("load_time", "TIMESTAMP", mode="REQUIRED", default_value_expression="CURRENT_TIMESTAMP"),
]

table_ref = client.dataset("magazine_recipes_stg").table(table_name)
table = bigquery.Table(table_ref, schema=schema)

client.create_table(table)

# Build 20 placeholder rows; only magazine_id is set, every other column is left empty.
rows_to_insert = []

for i in range(1, 21):
    row = {"magazine_id": i, "magazine_name": None, "website": None, "pub_frequency_weeks": None, "publishing_company" : None, 'subscription_price' : None, 'data_source': None, "load_time": None}
    rows_to_insert.append(row)

# table_ref (defined above) already points at magazine_recipes_stg.Magazines.
# insert_rows returns a list of per-row errors; an empty list means success.
errors = client.insert_rows(table_ref, rows_to_insert, schema)

if not errors:
    print("Rows inserted successfully.")
else:
    print("Encountered errors while inserting rows:", errors)


Rows inserted successfully.
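
A possible simplification, sketched here as an assumption rather than a tested change to the cell above: the placeholder rows can be built with only magazine_id set. The helper name build_placeholder_rows is hypothetical. The sketch assumes that columns left out of a row are stored as NULL when they are NULLABLE and that load_time falls back to its CURRENT_TIMESTAMP default; both points should be confirmed against the BigQuery documentation before relying on them.

```python
def build_placeholder_rows(count, id_field="magazine_id"):
    """Return `count` row dicts with ids 1..count and no other fields.

    Assumption: fields missing from a row are treated like the explicit
    None values used in the original cell.
    """
    return [{id_field: i} for i in range(1, count + 1)]


rows = build_placeholder_rows(20)
print(len(rows))  # 20
print(rows[0])    # {'magazine_id': 1}
```

The resulting list could be passed to client.insert_rows in place of rows_to_insert.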


# Primary & Foreign Keys

## Nutrition
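Declaring keys for the Nutrition table. BigQuery does not enforce these constraints (hence not enforced), so each one is verified with a query afterwards.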

In [None]:
%%bigquery
alter table magazine_recipes_stg.Nutrition add primary key (recipe_id) not enforced

Verifying that the primary keys are consistent and have no duplicates in Nutrition.

In [None]:
%%bigquery
select recipe_id, count(*) as duplicate_records
from magazine_recipes_stg.Nutrition
group by recipe_id
having count(*) > 1

| | recipe_id | duplicate_records |
|---|---|---|
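
The logic of the group by / having count(*) > 1 check can be sketched in plain Python. This is an illustration only: the helper name find_duplicate_keys is hypothetical, and the sample ids are taken from the preview rows shown earlier in this notebook.

```python
from collections import Counter


def find_duplicate_keys(ids):
    """Return {key: occurrences} for every key that appears more than once."""
    counts = Counter(ids)
    return {key: n for key, n in counts.items() if n > 1}


# The five recipe_id values from the preview are all distinct.
print(find_duplicate_keys([559, 838, 858, 873, 874]))  # {}

# A made-up list with a repeated id, to show what a failure looks like.
print(find_duplicate_keys([559, 838, 838]))  # {838: 2}
```

An empty result, as in the query output above, means every key value occurs exactly once.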


Adding a foreign key from Nutrition to Recipes.

In [None]:
%%bigquery
alter table magazine_recipes_stg.Nutrition add foreign key (recipe_id)
  references magazine_recipes_stg.Recipes (recipe_id) not enforced;

Verifying that there are no orphan records in Nutrition.

In [None]:
%%bigquery
select count(*) as orphan_records
from magazine_recipes_stg.Nutrition
where recipe_id not in (select recipe_id from magazine_recipes_stg.Recipes)

| | orphan_records |
|---|---|
| 0 | 0 |
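
One caveat with a not in orphan check: if the subquery returns a NULL, the comparison evaluates to NULL for every non-matching row, so the count can be 0 even when orphans exist. The sketch below demonstrates this with SQLite from the Python standard library, not BigQuery, on made-up data; the helper name count_orphans is hypothetical. It compares not in with a not exists formulation, which does not have this problem.

```python
import sqlite3


def count_orphans(nutrition_ids, recipe_ids):
    """Return (count via NOT IN, count via NOT EXISTS) for the given ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Recipes (recipe_id INTEGER)")
    conn.execute("CREATE TABLE Nutrition (recipe_id INTEGER)")
    conn.executemany("INSERT INTO Recipes VALUES (?)", [(r,) for r in recipe_ids])
    conn.executemany("INSERT INTO Nutrition VALUES (?)", [(n,) for n in nutrition_ids])

    # Same shape as the notebook query above.
    not_in = conn.execute(
        "SELECT COUNT(*) FROM Nutrition "
        "WHERE recipe_id NOT IN (SELECT recipe_id FROM Recipes)"
    ).fetchone()[0]

    # NULL-safe alternative: a row is an orphan if no matching recipe exists.
    not_exists = conn.execute(
        "SELECT COUNT(*) FROM Nutrition n "
        "WHERE NOT EXISTS (SELECT 1 FROM Recipes r WHERE r.recipe_id = n.recipe_id)"
    ).fetchone()[0]

    conn.close()
    return not_in, not_exists


# 999 is an orphan; both checks agree when Recipes has no NULL ids.
print(count_orphans([559, 838, 999], [559, 838]))        # (1, 1)

# With a NULL recipe_id in Recipes, NOT IN misses the orphan.
print(count_orphans([559, 838, 999], [559, 838, None]))  # (0, 1)
```

If Recipes.recipe_id can never be NULL, the not in query in the notebook is sufficient as written.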


## Magazines Primary and Foreign Keys

In [None]:
%%bigquery
alter table magazine_recipes_stg.Magazines add primary key (magazine_id) not enforced

With the primary key added, we verify that there are no duplicate records in Magazines.

In [None]:
%%bigquery
select magazine_id, count(*) as duplicate_records
from magazine_recipes_stg.Magazines
group by magazine_id
having count(*) > 1

| | magazine_id | duplicate_records |
|---|---|---|


## Cleanup
None needed!