## Leveraging micro-partitions and data clustering

During a quick chat in the hall, your Lead Data Engineer mentioned that Snowflake is using data clustering to sort data within micro-partitions by the year field in the olympic_medals table. You regularly run a few queries against this table, and you'd like to update them to take better advantage of Snowflake's micro-partitions and data clustering.

The create_engine function from the sqlalchemy module has been imported, and a connection object has been created and stored in the variable conn.

### Instructions
    - Update the Snowflake query to only return records for games that took place in 2000 or later.
    - Return the results of the Snowflake query as a pandas DataFrame, and print the result set.

In [None]:
# Leverage the existing micro-partitions and data clustering
query = """
SELECT
    team,
    year,
    sport,
    event,
    medal
FROM olympic_medals
WHERE year >= 2000;
"""

# Execute the query, print the results
results = conn.cursor().execute(query).fetch_pandas_all()
print(results)
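
To see why filtering on the clustered year column helps, here is a minimal sketch of partition pruning in plain Python. It is a toy model, not Snowflake's actual implementation: the `MicroPartition` class, the `scan_with_pruning` function, and the sample rows are all hypothetical. The idea it illustrates is that each partition carries min/max metadata for a column, so a filter such as `year >= 2000` can skip partitions whose maximum year is below 2000 without reading their rows.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    """Toy stand-in for a micro-partition: rows plus min/max year metadata."""
    rows: list  # list of (team, year, medal) tuples

    @property
    def min_year(self):
        return min(row[1] for row in self.rows)

    @property
    def max_year(self):
        return max(row[1] for row in self.rows)


def scan_with_pruning(partitions, min_year):
    """Return matching rows and the number of partitions actually scanned."""
    scanned = 0
    matches = []
    for partition in partitions:
        # Metadata check: no row in this partition can satisfy the filter.
        if partition.max_year < min_year:
            continue
        scanned += 1
        matches.extend(row for row in partition.rows if row[1] >= min_year)
    return matches, scanned


# Hypothetical rows, already clustered (sorted) by year.
partitions = [
    MicroPartition([("USA", 1992, "Gold"), ("CUB", 1996, "Silver")]),
    MicroPartition([("AUS", 2000, "Gold"), ("GRE", 2004, "Bronze")]),
    MicroPartition([("CHN", 2008, "Gold"), ("GBR", 2012, "Silver")]),
]

matches, scanned = scan_with_pruning(partitions, 2000)
print(f"scanned {scanned} of {len(partitions)} partitions, {len(matches)} matching rows")
```

Because the rows are sorted by year, the first partition is skipped entirely. If the same rows were scattered across partitions at random, every partition would likely contain at least one recent year and none could be skipped.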

## Querying semi-structured data in Snowflake

With Snowflake, semi-structured data can be stored in its raw form. Here, information about a handful of Olympic host cities is stored in the city_meta column of the host_cities table. This column has type VARIANT, which allows semi-structured data to be stored in a single column. The data takes the form below:

<br/>![](../../imgs/2_4-Exercises.png)<br/>

Snowflake table with a single column of type VARIANT.

In this exercise, you'll practice querying this data using both bracket and dot notation. A connection object conn for the olympics database has been created for you. Good luck!

### Instructions
    - Use dot-notation to retrieve the city field from the city_meta column in the host_cities table.
    - Use dot-notation to query the nested country field from the city_meta column in the host_cities table.

In [None]:
# Build a query to pull city and country names
query = """
SELECT
    city_meta:city,
    city_meta:country
FROM host_cities;
"""

# Execute query and output results
results = conn.cursor().execute(query).fetch_pandas_all()
print(results)
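
The query above uses the colon form to reach into the VARIANT column. As a sketch of the bracket-notation equivalent, the block below builds the same query with `city_meta['city']`. The `::STRING` casts and the column aliases are additions for readability and are not part of the exercise, and `fetch_frame` is a hypothetical helper that assumes `conn` behaves like the connection used in the cell above.

```python
# Bracket-notation equivalent of the colon-notation query.
# The ::STRING casts and AS aliases are optional additions (an assumption
# about what is convenient here, not something the exercise requires).
bracket_query = """
SELECT
    city_meta['city']::STRING AS city,
    city_meta['country']::STRING AS country
FROM host_cities;
"""


def fetch_frame(conn, query):
    """Hypothetical helper: run a query and return the result as a DataFrame."""
    return conn.cursor().execute(query).fetch_pandas_all()


# Usage, assuming `conn` exists as in the exercise:
# print(fetch_frame(conn, bracket_query))
```

Without a cast, values extracted from a VARIANT column are themselves VARIANT, so strings are typically displayed wrapped in double quotes. Casting to STRING returns plain text.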

## Querying nested semi-structured data

Within columns of type VARIANT, data can often be nested, as we see in the city_meta column of the host_cities table below.

<br/>![](../../imgs/2_4-Exercises-2.png)<br/>

Snowflake table with a single column of type VARIANT.

A Snowflake connection object to the database olympics has been created, and is available in the variable conn. Happy querying!

### Instructions
    - Using dot notation, complete the query to extract data from the nested lat field in the coordinates object.
    - Query the nested long field from the coordinates object in the city_meta column.
    - Execute the query, and print the results.

In [None]:
# Build a query to extract nested location coordinates
query = """
SELECT
    city_meta:coordinates.lat,
    city_meta:coordinates.long
FROM host_cities;
"""

# Execute the query and output the results
results = conn.cursor().execute(query).fetch_pandas_all()
print(results)
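
One way to picture what a path such as `city_meta:coordinates.lat` does is to walk the same path through a parsed JSON object. The sketch below does this in plain Python. The sample record and the `get_path` helper are hypothetical (the real contents of host_cities are not reproduced here), and returning `None` for a missing key is meant to mirror how a path that does not exist yields NULL rather than an error.

```python
import json

# Hypothetical record shaped like the city_meta column; values are illustrative.
raw = (
    '{"city": "Sydney", "country": "Australia", '
    '"coordinates": {"lat": -33.87, "long": 151.21}}'
)
city_meta = json.loads(raw)


def get_path(variant, path):
    """Walk a dot-separated path; return None if any key along it is missing."""
    current = variant
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


print(get_path(city_meta, "coordinates.lat"))        # -33.87
print(get_path(city_meta, "coordinates.long"))       # 151.21
print(get_path(city_meta, "coordinates.elevation"))  # None
```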