# Using sql2csv documentation

Suppose you're running a query with `sql2csv`, but the error message is not detailed enough to help you debug the problem. Which optional argument makes `sql2csv` print detailed tracebacks and logs when errors occur?

- `-v` or `--verbose`
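
To see what the flag changes, the sketch below runs a deliberately failing query twice, once without and once with `--verbose`. The table name is hypothetical, and `sqlite://` (SQLAlchemy's URL for an in-memory SQLite database) is used only so the example needs no database file. It assumes `csvkit` is installed and skips the commands otherwise.

```shell
# Hypothetical failing query: this table does not exist in an empty in-memory database
bad_query="SELECT * FROM table_that_does_not_exist"

if command -v sql2csv >/dev/null 2>&1; then
    # Without the flag: only a short error message is printed
    sql2csv --db "sqlite://" --query "$bad_query" || true

    # With the flag: the full traceback is printed
    sql2csv --verbose --db "sqlite://" --query "$bad_query" || true
else
    echo "sql2csv not found; install csvkit to run this example" >&2
fi
```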

# Understand sql2csv connectors

Suppose you have a SQL database you would like to connect to using `sql2csv`, but you're not sure whether it is supported. The `sql2csv` manual does not list the available database connectors, but the `csvsql` manual does.

Using the `csvsql` manual, which SQL database connection is currently NOT supported by `sql2csv` and the rest of the `csvkit` suite?

- MongoDB (Because it is NoSQL)
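
One way to look this up from the terminal is to search the `csvsql` help text for its `--dialect` option, which lists the SQL dialects it can generate. This is a minimal sketch that assumes `csvkit` is installed; the exact list printed depends on the installed version, and `dialect_check` is only an illustrative status variable.

```shell
if command -v csvsql >/dev/null 2>&1; then
    # Show the lines of the help text that mention the dialect option
    csvsql --help | grep -i -A 3 "dialect" || true
    dialect_check="done"
else
    echo "csvsql not found; install csvkit to run this example" >&2
    dialect_check="skipped"
fi
```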

# Practice pulling data from database

With the powers of `csvkit`, we don't need to download and set up fancy database management software like MS SQL Server, DB2, PgAdmin, or TablePlus to be able to access the data inside a SQL database. We can pull data directly from our command line using `csvkit`'s `sql2csv` command.

In this practice, let's walk through pulling data step by step. We will query the table `Spotify_Popularity`, which lives in a SQLite database called `SpotifyDatabase`, and then save the output of the SQL query to a local file, `Spotify_Popularity_5Rows.csv`.

```
# Verify database name 
ls

# Pull the entire Spotify_Popularity table and print in log
sql2csv --db "sqlite:///SpotifyDatabase.db" \
        --query "SELECT * FROM Spotify_Popularity" 
```

```
# Verify database name 
ls

# Query first 5 rows of Spotify_Popularity and print in log
sql2csv --db "sqlite:///SpotifyDatabase.db" \
        --query "SELECT * FROM Spotify_Popularity LIMIT 5" \
        | csvlook
```

```
# Verify database name 
ls

# Save query to new file Spotify_Popularity_5Rows.csv
sql2csv --db "sqlite:///SpotifyDatabase.db" \
        --query "SELECT * FROM Spotify_Popularity LIMIT 5" \
        > Spotify_Popularity_5Rows.csv

# Verify newly created file
ls

# Print preview of newly created file
csvlook Spotify_Popularity_5Rows.csv
```
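
The `--db` value in the commands above is an SQLAlchemy-style connection string. For SQLite, that is `sqlite:///` followed by a relative path, or `sqlite:////` (four slashes) followed by an absolute path. The sketch below builds both forms from shell variables; the variable names are illustrative and not part of `csvkit`.

```shell
# File name of the SQLite database used in this section
db_file="SpotifyDatabase.db"

# Relative path: three slashes, resolved against the current directory
db_url_relative="sqlite:///${db_file}"

# Absolute path: the path itself starts with "/", giving four slashes in total
db_url_absolute="sqlite:///$(pwd)/${db_file}"

echo "$db_url_relative"
echo "$db_url_absolute"
```

Either variable can then be passed in place of the literal string, for example `sql2csv --db "$db_url_relative" --query "SELECT * FROM Spotify_Popularity LIMIT 5"`.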

# Applying SQL to a local CSV file

Sometimes the data manipulation we want to do is simply easier in SQL. Here, we want to find the song with the shortest duration in `Spotify_MusicAttributes.csv` by applying the SQL below directly to the data file.

`SELECT * FROM Spotify_MusicAttributes ORDER BY duration_ms LIMIT 1`

Let's go through this step by step.

```
# List files to confirm the CSV file is present
ls

# Apply SQL query to Spotify_MusicAttributes.csv
csvsql --query "SELECT * FROM Spotify_MusicAttributes ORDER BY duration_ms LIMIT 1" Spotify_MusicAttributes.csv
```

```
# Reformat the output using csvlook 
csvsql --query "SELECT * FROM Spotify_MusicAttributes ORDER BY duration_ms LIMIT 1" \
	Spotify_MusicAttributes.csv | csvlook
```

```
# Re-direct output to new file: ShortestSong.csv
csvsql --query "SELECT * FROM Spotify_MusicAttributes ORDER BY duration_ms LIMIT 1" \
	Spotify_MusicAttributes.csv > ShortestSong.csv
    
# Preview newly created file 
csvlook ShortestSong.csv
```

# Cleaner scripting via shell variables

Because SQL queries can be long and complex, we will frequently need to deal with line breaks when passing them to `csvkit` commands.

One way to work around this is to store the SQL query in a shell variable, then pass the variable in place of the query wherever it is needed.

```
# List files to confirm the CSV file is present
ls

# Store SQL query as shell variable
sqlquery="SELECT * FROM Spotify_MusicAttributes ORDER BY duration_ms LIMIT 1"

# Apply SQL query to Spotify_MusicAttributes.csv
csvsql --query "$sqlquery" Spotify_MusicAttributes.csv
```
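
The same approach extends to queries that span several lines. The sketch below stores the query from this section with line breaks inside a double-quoted variable, and prints a one-line version as an optional check of what will be passed. The name `sqlquery_one_line` is illustrative. The `csvsql` call runs only if `csvkit` and the CSV file are available.

```shell
# Multi-line SQL in one shell variable; the double quotes preserve the line breaks
sqlquery="
SELECT *
FROM Spotify_MusicAttributes
ORDER BY duration_ms
LIMIT 1
"

# Optional check: collapse the query to a single line to inspect it
sqlquery_one_line=$(printf '%s' "$sqlquery" | tr '\n' ' ' | sed 's/^ *//; s/ *$//')
echo "$sqlquery_one_line"

# Quote the variable so the whole query reaches csvsql as a single argument
if command -v csvsql >/dev/null 2>&1 && [ -f Spotify_MusicAttributes.csv ]; then
    csvsql --query "$sqlquery" Spotify_MusicAttributes.csv
fi
```

Leaving out the double quotes around `$sqlquery` would let the shell split the query into separate words, so the quotes matter as much as the variable itself.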

# Joining local CSV files using SQL

`csvsql` can join CSV files together even when neither of them is in a database. Here, we have two CSV files, `Spotify_MusicAttributes.csv` and `Spotify_Popularity.csv`, that both have one row per song but contain different attributes for each song. We can combine the two files with a SQL JOIN run through `csvsql`.

- To list the column names of both files: `csvcut -n Spotify_MusicAttributes.csv; csvcut -n Spotify_Popularity.csv;`
- The column shared by both files, used as the join key: `track_id`

```
# Store SQL query as shell variable
sql_query="SELECT ma.*, p.popularity FROM Spotify_MusicAttributes ma INNER JOIN Spotify_Popularity p ON ma.track_id = p.track_id"

# Join 2 local csvs into a new csv using the saved SQL
csvsql --query "$sql_query" Spotify_MusicAttributes.csv Spotify_Popularity.csv > Spotify_FullData.csv

# Print summary statistics for the newly created file
csvstat Spotify_FullData.csv
```
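
To try the join pattern without the Spotify files, the sketch below builds two tiny, made-up CSV files in a temporary directory and joins them on `track_id`. The file names, values, and the `workdir` variable are all invented for illustration. The join itself runs only if `csvkit` is installed.

```shell
# Work in a temporary directory so no real files are touched
workdir=$(mktemp -d)

# Two small, made-up files that share the track_id column
printf 'track_id,duration_ms\nt1,200000\nt2,180000\n' > "$workdir/Attributes.csv"
printf 'track_id,popularity\nt1,55\nt2,70\n' > "$workdir/Popularity.csv"

# Table names in the query come from the file names (minus the .csv extension)
join_query="SELECT a.*, p.popularity FROM Attributes a INNER JOIN Popularity p ON a.track_id = p.track_id"

if command -v csvsql >/dev/null 2>&1; then
    (
        cd "$workdir" &&
        csvsql --query "$join_query" Attributes.csv Popularity.csv > Joined.csv &&
        cat Joined.csv
    ) || echo "join failed" >&2
else
    echo "csvsql not found; install csvkit to run the join" >&2
fi
```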

# Practice pushing data back to database

It is also possible to go the other way and push local CSV files to a database. As long as we specify the database and the CSV file to load, `csvsql` does the rest of the work behind the scenes (for example, inferring the table schema).

In the following exercise, complete the command to upload `Spotify_MusicAttributes.csv` as its own table in the SQLite database `SpotifyDatabase`. Then, as a sanity check, re-pull the data from the newly created table in the database.

```
# List files to confirm the CSV file is present
ls

# Upload Spotify_MusicAttributes.csv to database
csvsql --db "sqlite:///SpotifyDatabase.db" --insert Spotify_MusicAttributes.csv

# Store SQL query as shell variable
sqlquery="SELECT * FROM Spotify_MusicAttributes"

# Apply SQL query to re-pull new table in database
sql2csv --db "sqlite:///SpotifyDatabase.db" --query "$sqlquery" 
```
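
By default the new table takes its name from the CSV file. The sketch below shows the same push-and-re-pull round trip on a throwaway database, using the `--tables` option of `csvsql` to choose the table name explicitly. The sample data, `demo.db`, and `SampleAttributes` are made up for illustration. The commands run only if `csvkit` is installed.

```shell
# Throwaway working directory, sample file, and database
workdir=$(mktemp -d)
printf 'track_id,duration_ms\nt1,200000\nt2,180000\n' > "$workdir/sample.csv"

# $workdir is an absolute path, so this yields the four-slash SQLite URL form
db_url="sqlite:///$workdir/demo.db"
table_name="SampleAttributes"

if command -v csvsql >/dev/null 2>&1 && command -v sql2csv >/dev/null 2>&1; then
    # Create the table (schema inferred from the CSV) and insert the rows
    csvsql --db "$db_url" --tables "$table_name" --insert "$workdir/sample.csv" \
        || echo "insert failed" >&2

    # Sanity check: count the rows now stored in the new table
    sql2csv --db "$db_url" --query "SELECT COUNT(*) AS n_rows FROM $table_name" \
        || echo "re-pull failed" >&2
else
    echo "csvkit not found; install it to run this example" >&2
fi
```

Running the insert a second time against the same database is expected to fail because the table already exists. Depending on the installed version, `csvsql --help` lists options such as `--no-create` or `--overwrite` for that case.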

# Database and SQL with csvkit

With `csvsql` and `sql2csv`, we can run an entire data workflow inside the terminal without installing and setting up additional SQL clients and software. In this capstone, we will pull data from a SQLite database, combine it with a locally saved file, and finally push the combined file back to the database, all without ever leaving the command line.

```
# Store SQL for querying from SQLite database 
sqlquery_pull="SELECT * FROM SpotifyMostRecentData"

# Apply SQL to save table as local file 
sql2csv --db "sqlite:///SpotifyDatabase.db" --query "$sqlquery_pull" > SpotifyMostRecentData.csv

# Store SQL for UNION of the two local CSV files
sqlquery_union="SELECT * FROM SpotifyMostRecentData UNION ALL SELECT * FROM Spotify201812"

# Apply SQL to union the two local CSV files and save as local file
csvsql --query "$sqlquery_union" SpotifyMostRecentData.csv Spotify201812.csv > UnionedSpotifyData.csv

# Push UnionedSpotifyData.csv to database as a new table
csvsql --db "sqlite:///SpotifyDatabase.db" --insert UnionedSpotifyData.csv
```
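
A quick sanity check for a `UNION ALL` is that the combined file has as many data rows as the two inputs together. The sketch below illustrates that check on made-up files. `count_rows` is a hypothetical helper, and the union here is built with plain `head` and `tail` as a stand-in for the `csvsql` step so that the example runs without `csvkit`. Counting lines assumes no field contains an embedded line break.

```shell
# Hypothetical helper: count data rows (every line after the header)
count_rows() {
    tail -n +2 "$1" | wc -l | tr -d ' '
}

# Made-up input files in a temporary directory
workdir=$(mktemp -d)
printf 'track_id,popularity\nt1,55\nt2,70\n' > "$workdir/recent.csv"
printf 'track_id,popularity\nt3,40\n' > "$workdir/older.csv"

# Stand-in for the UNION ALL step: header once, then the rows of both files
{
    head -n 1 "$workdir/recent.csv"
    tail -n +2 "$workdir/recent.csv"
    tail -n +2 "$workdir/older.csv"
} > "$workdir/unioned.csv"

expected=$(( $(count_rows "$workdir/recent.csv") + $(count_rows "$workdir/older.csv") ))
actual=$(count_rows "$workdir/unioned.csv")

if [ "$actual" -eq "$expected" ]; then
    echo "Row counts match: $actual"
else
    echo "Row count mismatch: expected $expected, got $actual" >&2
fi
```

In the capstone itself, the same comparison would be made between `SpotifyMostRecentData.csv`, `Spotify201812.csv`, and `UnionedSpotifyData.csv`.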