In this first chapter, you’ll learn how to query a films database and select the data needed to answer questions about the movies and actors. You'll also understand how SQL code is executed and formatted.

## Querying a database

#### Learning to COUNT()
You saw how to use COUNT() in the video. Do you remember what it returns?

Here is a query counting film_id. Select the answer below that correctly describes what the query will return.

SELECT COUNT(film_id) AS count_film_id
FROM reviews;

Run the query in the console to test your theory!

In [24]:
import pandas as pd

films = pd.read_csv('films.csv', header=None)  # the CSV files have no header row
reviews = pd.read_csv('reviews.csv', header=None)
people = pd.read_csv('people.csv', header=None)
roles = pd.read_csv('roles.csv', header=None)

films.columns = ['id', 'title', 'release_year','country','duration','language','certification','gross','budget'] # assign column names
reviews.columns = ['film_id','num_user','num_critic','imdb_score','num_votes','facebook_likes']
people.columns = ['id','name','birthdate','deathdate']
roles.columns = ['id','film_id','persons_id','role']

films

Unnamed: 0,id,title,release_year,country,duration,language,certification,gross,budget
0,1,Intolerance: Love's Struggle Throughout the Ages,1916.0,USA,123.0,,Not Rated,,385907.0
1,2,Over the Hill to the Poorhouse,1920.0,USA,110.0,,,3000000.0,100000.0
2,3,The Big Parade,1925.0,USA,151.0,,Not Rated,,245000.0
3,4,Metropolis,1927.0,Germany,145.0,German,Not Rated,26435.0,6000000.0
4,5,Pandora's Box,1929.0,Germany,110.0,German,Not Rated,9950.0,
...,...,...,...,...,...,...,...,...,...
4963,4964,Unforgotten,,UK,45.0,English,,,
4964,4965,Wings,,USA,30.0,English,,,
4965,4966,Wolf Creek,,Australia,,English,,,
4966,4967,Wuthering Heights,,UK,142.0,English,,,


In [32]:
'''
from pandasql import sqldf

# Define an SQL query; pandasql resolves table names against DataFrame
# variables, so the table is simply `reviews` (no schema prefix)
query = "SELECT COUNT(film_id) AS count_film_id FROM reviews;"

# Run the query using sqldf()
result = sqldf(query, locals())

# Print the result
print(result)
'''

##############################################################################################################
'''
from sqlalchemy import create_engine
import pandas as pd

# Connect to an existing PostgreSQL database (placeholder credentials)
engine = create_engine('postgresql://username:password@hostname/database_name')

# Write the reviews DataFrame to a table named reviews
reviews.to_sql('reviews', engine, if_exists='replace', index=False)

# Define the SQL query as a plain string
query = "SELECT COUNT(film_id) AS count_film_id FROM public.reviews;"

# Run the query using pandas
result = pd.read_sql_query(query, engine)

# Print the result
print(result)
'''

![Learning%20to%20COUNT%28%29.png](attachment:Learning%20to%20COUNT%28%29.png)

Correct! COUNT(field_name) returns the number of records containing a value in a field. In this example, that field is film_id.
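
To make this concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module instead of the course console. The tiny `reviews` table and its rows are invented for illustration and are not the real dataset. It shows `COUNT(film_id)` skipping rows where `film_id` is NULL, while `COUNT(*)` counts every row.

```python
import sqlite3

# Hypothetical miniature of the reviews table; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (film_id INTEGER, imdb_score REAL)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [(1, 7.1), (2, 6.4), (None, 8.0), (4, None)],
)

# COUNT(field) counts only the rows where that field holds a value.
count_film_id = conn.execute(
    "SELECT COUNT(film_id) AS count_film_id FROM reviews"
).fetchone()[0]

# COUNT(*) counts every row, including rows with missing values.
count_all = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]

print(count_film_id, count_all)
```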

![2%20-%20practice%20with%20count%28%29.png](attachment:2%20-%20practice%20with%20count%28%29.png)

![2.2.%20Practice%20with%20COUNT%28%29%202.png](attachment:2.2.%20Practice%20with%20COUNT%28%29%202.png)

![2.3.%20Practice%20with%20count%28%29.png](attachment:2.3.%20Practice%20with%20count%28%29.png)

Très bien! Comparing the count of unique values, the count of non-missing values, and the count of all records can provide useful insights into your data.
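
The three counts can be compared side by side in a single query. The sketch below again assumes an in-memory SQLite database and a made-up `films` table. The aliases `unique_countries`, `total_countries`, and `all_records` are hypothetical names chosen for this example.

```python
import sqlite3

# Hypothetical sample rows: one country is repeated and one is missing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [(1, "USA"), (2, "USA"), (3, "Germany"), (4, None), (5, "UK")],
)

query = """
SELECT COUNT(DISTINCT country) AS unique_countries,
       COUNT(country)          AS total_countries,
       COUNT(*)                AS all_records
FROM films;
"""
unique_countries, total_countries, all_records = conn.execute(query).fetchone()

print(unique_countries, total_countries, all_records)
```

A gap between `total_countries` and `all_records` points to missing values, and a gap between `unique_countries` and `total_countries` points to repeated values.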

![3.1.%20select%20distinct.png](attachment:3.1.%20select%20distinct.png)

![3.2.%20select%20distinct%28%29.png](attachment:3.2.%20select%20distinct%28%29.png)

Congratulations! DISTINCT is a great tool for seeing the unique values in a dataset. This table has 64 unique countries.
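
As a small illustration of `SELECT DISTINCT`, the following sketch lists each country once from a made-up `films` table in an in-memory SQLite database. The rows are assumptions for the example, so the result is not the 64 countries of the real table.

```python
import sqlite3

# Hypothetical sample rows with repeated countries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [
        ("Film A", "USA"),
        ("Film B", "Germany"),
        ("Film C", "USA"),
        ("Film D", "UK"),
        ("Film E", "Germany"),
    ],
)

# DISTINCT returns each country once, however often it appears.
rows = conn.execute("SELECT DISTINCT country FROM films").fetchall()

# Sort in Python so the printed order does not depend on the database.
distinct_countries = sorted(row[0] for row in rows)
print(distinct_countries)
```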

## Query execution

![4.%20Order%20of%20execution.png](attachment:4.%20Order%20of%20execution.png)

Congratulations! This is the correct order of execution: FROM, then SELECT, then LIMIT. It makes sense that SQL needs to SELECT data FROM a table before it can LIMIT the results.
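
One way to picture this order is to mimic each step in plain Python and compare it with what the database returns. This is only a conceptual sketch over a made-up `people` table in SQLite. Real database engines are free to optimize how they carry out a query.

```python
import sqlite3

# Hypothetical sample rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Person A"), (2, "Person B"), (3, "Person C")],
)

# The query is written in the order SELECT, FROM, LIMIT.
sql_result = conn.execute("SELECT name FROM people LIMIT 2").fetchall()

# Conceptually it is processed in the order FROM, SELECT, LIMIT.
table = conn.execute("SELECT * FROM people").fetchall()  # 1. FROM: locate the table
selected = [(name,) for (_id, name) in table]            # 2. SELECT: keep the chosen field
limited = selected[:2]                                   # 3. LIMIT: cut down the rows

print(sql_result)
print(limited)
```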

![5.1.%20Debugging%20errors.png](attachment:5.1.%20Debugging%20errors.png)

![5.2.%20Debugging%20errors.png](attachment:5.2.%20Debugging%20errors.png)

![5.3.%20debugging%20errors.png](attachment:5.3.%20debugging%20errors.png)

Excellent extermination of those bugs! This is an important skill that will come in very handy.
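
The exercises themselves are only shown as screenshots, so the sketch below is a generic illustration rather than a copy of them. It runs two typical mistakes, a misspelled keyword and a misspelled field name, against an in-memory SQLite database and captures the error text. The helper `try_query` is a hypothetical name, and the wording of the messages differs between SQLite and PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (id INTEGER, title TEXT)")


def try_query(sql):
    """Run a query and return its rows, or the error message if it fails."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as err:
        return f"error: {err}"


misspelled_keyword = try_query("SELCT title FROM films")  # typo in SELECT
misspelled_column = try_query("SELECT titel FROM films")  # typo in the field name
fixed = try_query("SELECT title FROM films")              # corrected query

print(misspelled_keyword)
print(misspelled_column)
print(fixed)
```

Reading the message usually narrows the search: a syntax error points at the keyword or punctuation near the reported position, while a missing-column error points at a field name.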

## SQL style

![6.%20SQL%20best%20practices.png](attachment:6.%20SQL%20best%20practices.png)

Well done! You'll soon become everyone's favorite SQL programmer.

![7.%20Formatting.png](attachment:7.%20Formatting.png)

Great work formatting the code! Clean code allows for clean communication.

![8.%20non-standard%20fields.png](attachment:8.%20non-standard%20fields.png)

Correct! Using double quotes around a non-standard name allows us to run the SQL query.
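
A short sketch of the same idea, assuming an in-memory SQLite database and a hypothetical field called `release year` (with a space). The double quotes tell the database to treat the two words as one identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "release year" contains a space, so it must be wrapped in double quotes.
conn.execute('CREATE TABLE films (title TEXT, "release year" INTEGER)')
conn.execute("INSERT INTO films VALUES (?, ?)", ("Metropolis", 1927))

rows = conn.execute('SELECT title, "release year" FROM films').fetchall()
print(rows)
```

Naming the field `release_year` in the first place avoids the need for quoting.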