# SQL Self-sufficiency exam

Instructions:
1. Submit a single text file that contains the queries you wrote for each prompt. Indicate which question each query answers with a code comment, as in the answers below (for example, --1).

2. Your submission will be evaluated using 2 criteria:

First, the grader will check that you have written the correct query. Each of the prompts below has a single correct answer.

Second, the grader will look at how you've styled your queries and gauge whether you followed the recommended style guidelines for this program. In particular, they'll look at whether you capitalized SQL keywords, used lowercase letters for table and field names, and used multiple lines to make your queries more readable.


    1. Put each column name in a SELECT clause on its own line, with one level of indentation from the preceding line.
    2. Follow the same indentation logic for FROM, WHERE, and ORDER BY blocks, giving each element its own line.
    3. Similarly, each clause gets its own line.
    4. Use all caps for clauses, function names, and the like.
    5. Use the actual case of the column/table name when referring to column and table names.
    6. Be consistent in your own use of casing, but recognize that SQL keywords are not case sensitive and that the database ignores tabs and newlines between tokens.
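
As an illustration of these guidelines, the query below is a minimal sketch, not one of the exam answers. It assumes the naep table and the state, year, and avg_math_4_score columns described in the scenario; the filter year 2017 is an arbitrary choice for the example.

```sql
-- example (illustrative only, not an exam answer)
SELECT
	state,
	year,
	avg_math_4_score
FROM
	public.naep
WHERE
	year = 2017
ORDER BY
	state;
```

Each clause keyword sits on its own line in capitals, and each column, table, or condition follows on an indented line in lowercase.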


# The scenario

You are a data analyst for your state’s department of education. You're given a database containing 2 tables: *naep* and *finance*. NAEP is the National Assessment of Educational Progress for states. The naep table contains each state’s average NAEP scores in math and reading for students in grades 4 and 8 for various years between 1992 and 2017. The finance table contains each state’s total K-12 education revenue and expenditures for the years 1992 through 2016.

You are tasked with assessing the quality of this data. You must also find useful ways to analyze it.

### 1. Write a query that allows you to inspect the schema of the naep table.

--1
SELECT *
FROM
	information_schema.columns
WHERE
	table_schema = 'public'
	AND table_name = 'naep';

### 2. Write a query that returns the first 50 records of the naep table.

--2
SELECT *
FROM
	public.naep
LIMIT
	50;

### 3. Write a query that returns summary statistics for avg_math_4_score by state. Make sure to sort the results alphabetically by state name.

--3
SELECT
	state,
	COUNT(avg_math_4_score),
	AVG(avg_math_4_score) AS average,
	MIN(avg_math_4_score) AS minimum,
	MAX(avg_math_4_score) AS maximum
FROM
	public.naep
GROUP BY
	state
ORDER BY
	state;

### 4. Write a query that alters the previous query so that it returns only the summary statistics for avg_math_4_score by state with differences in max and min values that are greater than 30.

--4
SELECT
	state,
	COUNT(avg_math_4_score),
	AVG(avg_math_4_score) AS average,
	MIN(avg_math_4_score) AS minimum,
	MAX(avg_math_4_score) AS maximum
FROM
	public.naep
GROUP BY
	state
HAVING
	MAX(avg_math_4_score) - MIN(avg_math_4_score) > 30
ORDER BY
	state;


### 5. Write a query that returns a field called bottom_10_states that lists the states in the bottom 10 for avg_math_4_score in the year 2000.

--5
SELECT
	state AS bottom_10_states,
	avg_math_4_score,
	year
FROM
	public.naep
WHERE
	year = 2000
	AND avg_math_4_score IS NOT NULL
ORDER BY
	avg_math_4_score ASC
LIMIT
	10;

### 6. Write a query that calculates the average avg_math_4_score rounded to the nearest 2 decimal places over all states in the year 2000.

--6 
SELECT
	ROUND(AVG(avg_math_4_score), 2),
	year
FROM
	public.naep
WHERE
	year = 2000
GROUP BY
	year; 

### 7. Write a query that returns a field called below_average_states_y2000 that lists all states with an avg_math_4_score less than the average over all states in the year 2000.

--7
SELECT
	state AS below_average_states_y2000,
	avg_math_4_score,
	year
FROM
	public.naep
WHERE
	year = 2000
	AND avg_math_4_score < (
		SELECT
			AVG(avg_math_4_score)
		FROM
			public.naep
		WHERE
			year = 2000
	)
ORDER BY
	avg_math_4_score,
	state;


### 8. Write a query that returns a field called scores_missing_y2000 that lists any states with missing values in the avg_math_4_score column of the naep data table for the year 2000. 

--8
SELECT
	state AS scores_missing_y2000,
	avg_math_4_score,
	year
FROM
	public.naep
WHERE
	year = 2000
	AND avg_math_4_score IS NULL
ORDER BY
	state;

### 9. Write a query that returns for the year 2000 the state, avg_math_4_score, and total_expenditure from the naep table left outer joined with the finance table, using id as the key and ordered by total_expenditure greatest to least. Be sure to round avg_math_4_score to the nearest 2 decimal places, and then filter out NULL avg_math_4_scores in order to see any correlation more clearly.

--9
SELECT
	naep.state,
	ROUND(naep.avg_math_4_score, 2) AS avg_math_4_score,
	finance.total_expenditure
FROM
	public.naep
LEFT OUTER JOIN
	public.finance
	ON naep.id = finance.id
WHERE
	naep.year = 2000
	AND naep.avg_math_4_score IS NOT NULL
ORDER BY
	finance.total_expenditure DESC;