# 1. Overview

In this mission, we'll explore how to interact with a SQLite database in Python so you can start to incorporate databases into your data science workflow.

`SQLite is a database that doesn't require a standalone server; it stores the entire database as a file on disk. This makes it ideal for working with larger datasets that can fit on disk but not in memory.`

When we work with data in Python, we usually load the entire data set into memory. This makes SQLite a compelling alternative for data sets that are larger than the available memory (roughly 8 gigabytes on many modern computers). `The fact that we can contain an entire database in a single file makes it easy to share; some data sets are available online as SQLite database files (using the extension .db).`

We can interact with a SQLite database in two main ways:

* Through the [sqlite3 Python module](https://docs.python.org/3/library/sqlite3.html)
* Through the [SQLite shell](https://sqlite.org/cli.html)

# 2. Introduction to the Data

We'll continue to work with the American Community Survey data on college majors and job outcomes.

The full table has many more columns than the ones we've displayed above (21 to be specific). You can learn about all of them in [FiveThirtyEight's GitHub repository](https://github.com/fivethirtyeight/data/tree/master/college-majors).

Here are the descriptions for the columns in the preview:

* Rank: The major's rank by median earnings
* Major_code: The major's code or ID
* Major: The name of the major
* Major_category: The broader category the major belongs to
* Total: The total number of people who studied the major
* Sample_size: The sample size (unweighted) of graduates with full time jobs
* Men: The number of male graduates
* Women: The number of female graduates
* ShareWomen: Women as a proportion of the total number of graduates (a number ranging from 0 to 1)
* Employed: The number of employed graduates

# 3. Connecting to the Database

Specifically, we'll work with the sqlite3 Python module, which was developed to work with SQLite version 3.

* We can import it into our environment using this command:

  `import sqlite3`

* Once we import the module, we connect to the database we want to query using the `connect()` function. This function requires a single parameter, which is the database we want to connect to. Because the database we're working with exists as a file on disk, we need to pass in the file name:

  `sqlite3.connect('jobs.db')`

* The `connect()` function returns a `Connection` instance, which maintains the connection to the database we want to work with.

In [1]:
import sqlite3
conn=sqlite3.connect('jobs.db')
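
As a minimal sketch of what `connect()` gives back, the example below connects to SQLite's special `':memory:'` database so that it runs without `jobs.db` on disk. The in-memory database is an assumption made for illustration; passing a file name behaves the same way.

```python
import sqlite3

# ':memory:' creates a temporary in-memory database (an assumption for this
# sketch); a file name such as 'jobs.db' is passed in exactly the same way.
conn = sqlite3.connect(':memory:')

# connect() returns a Connection instance.
is_connection = isinstance(conn, sqlite3.Connection)
print(type(conn), is_connection)

conn.close()
```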

# 4. Introduction to Cursor Objects and Tuples

Before we can execute a query, `we need to express our SQL query as a string`. While `we use the Connection class` to represent the database we're working with, `we use the Cursor class to`:

* Run a query against the database
* Parse the results from the database
* Convert the results to native Python objects
* Store the results within the Cursor instance as a local variable
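
The responsibilities listed above can be sketched roughly as follows. Because `jobs.db` may not be available, this example builds a small in-memory stand-in for `recent_grads` with only two columns and two sample rows; that stand-in table is an assumption for illustration, not the real data set.

```python
import sqlite3

# Hypothetical stand-in for jobs.db: an in-memory table with two sample rows.
sample_rows = [
    ('PETROLEUM ENGINEERING', 'Engineering'),
    ('MINING AND MINERAL ENGINEERING', 'Engineering'),
]
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE recent_grads (Major TEXT, Major_category TEXT)")
conn.executemany("INSERT INTO recent_grads VALUES (?, ?)", sample_rows)

cursor = conn.cursor()

# Run the query (expressed as a string) against the database.
cursor.execute("SELECT Major, Major_category FROM recent_grads")

# Retrieve the results as native Python objects: a list of tuples of strings.
rows = cursor.fetchall()
print(type(rows), type(rows[0]), type(rows[0][0]))
print(rows)

conn.close()
```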

After running a query and converting the results to a list of tuples, the `Cursor instance stores the list as a local variable`. Before diving into the syntax of querying the database, let's review some of what we previously learned about tuples.

# 5. Working With Sequences of Values as Tuples

A tuple is a core data structure that Python uses to represent a sequence of values, similar to a list. Unlike lists, tuples are immutable, which means we can't modify existing ones. `Python represents each row in the results set as a tuple.`

In [2]:
# To create an empty tuple, assign a pair of empty parentheses to a variable:
t = ()

In [3]:
# Python indexes tuples from 0 to n-1, just like it does with lists. We access the values in a tuple using bracket notation.
t = ('Apple', 'Banana')
apple = t[0] 
banana = t[1]



**`Tuples are generally faster to create and use less memory than lists, so they're helpful with larger databases and larger results sets.`**
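
To make the immutability described above concrete, here is a short sketch using a tuple shaped like one row of query results. The row values are sample data chosen for illustration.

```python
row = ('PETROLEUM ENGINEERING', 'Engineering')

# Indexing and unpacking work the same way as they do for lists.
first_value = row[0]
major, category = row

# Assigning to an index fails because tuples are immutable.
try:
    row[0] = 'CHEMICAL ENGINEERING'
    modified = True
except TypeError as error:
    modified = False
    print("Cannot modify a tuple:", error)

# A one-value tuple needs a trailing comma, which is why single-column
# query results are displayed like ('PETROLEUM ENGINEERING',).
single = ('PETROLEUM ENGINEERING',)
print(len(single), type(single))
```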

# 6. Creating a Cursor and Running a Query

We need to use the `Connection` instance method `cursor()` to return a `Cursor` instance corresponding to the database we want to query.

`cursor = conn.cursor()`

## TODO:
* Write a query that returns all of the values in the Major column from the recent_grads table.
* Store the full results set (a list of tuples) in majors.
* Then, print the first three tuples in majors.

In [4]:
import sqlite3
conn=sqlite3.connect('jobs.db')

cursor=conn.cursor()  # connection instance method cursor

query="SELECT Major FROM recent_grads"

cursor.execute(query)

majors=cursor.fetchall()
print(majors[0:3])  # the slice 0:3 returns the first three tuples


# 7. Execute as a Shortcut for Running a Query

So far, we've run queries by creating a Cursor instance and then calling the execute() method on that instance. The sqlite3 module actually allows us to skip creating a Cursor ourselves by calling the execute() method on the Connection object directly. The module creates a Cursor instance for us under the hood, runs our query against the database, and returns that Cursor, allowing us to skip a step. Here's what the code looks like:

In [5]:
conn=sqlite3.connect('jobs.db')
query="SELECT Major FROM recent_grads"
conn.execute(query).fetchall()

[('PETROLEUM ENGINEERING',),
 ('MINING AND MINERAL ENGINEERING',),
 ('METALLURGICAL ENGINEERING',),
 ('NAVAL ARCHITECTURE AND MARINE ENGINEERING',),
 ('CHEMICAL ENGINEERING',),
 ('NUCLEAR ENGINEERING',),
 ('ACTUARIAL SCIENCE',),
 ('ASTRONOMY AND ASTROPHYSICS',),
 ('MECHANICAL ENGINEERING',),
 ('ELECTRICAL ENGINEERING',),
 ('COMPUTER ENGINEERING',),
 ('AEROSPACE ENGINEERING',),
 ('BIOMEDICAL ENGINEERING',),
 ('MATERIALS SCIENCE',),
 ('ENGINEERING MECHANICS PHYSICS AND SCIENCE',),
 ('BIOLOGICAL ENGINEERING',),
 ('INDUSTRIAL AND MANUFACTURING ENGINEERING',),
 ('GENERAL ENGINEERING',),
 ('ARCHITECTURAL ENGINEERING',),
 ('COURT REPORTING',),
 ('COMPUTER SCIENCE',),
 ('FOOD SCIENCE',),
 ('ELECTRICAL ENGINEERING TECHNOLOGY',),
 ('MATERIALS ENGINEERING AND MATERIALS SCIENCE',),
 ('MANAGEMENT INFORMATION SYSTEMS AND STATISTICS',),
 ('CIVIL ENGINEERING',),
 ('CONSTRUCTION SERVICES',),
 ('OPERATIONS LOGISTICS AND E-COMMERCE',),
 ('MISCELLANEOUS ENGINEERING',),
 ('PUBLIC POLICY',),
 ('ENVIRONMENTA

`Notice that we didn't explicitly create a separate Cursor instance ourselves in this code example.`
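
As a quick check of what happens under the hood, the sketch below confirms that `Connection.execute()` hands back a `Cursor` instance. It uses a one-row in-memory table as a stand-in for `jobs.db`, which is an assumption for illustration.

```python
import sqlite3

# Hypothetical one-row stand-in for the recent_grads table in jobs.db.
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE recent_grads (Major TEXT)")
conn.execute("INSERT INTO recent_grads VALUES ('PETROLEUM ENGINEERING')")

# Connection.execute() creates a Cursor, runs the query on it, and returns it.
result = conn.execute("SELECT Major FROM recent_grads")
returns_cursor = isinstance(result, sqlite3.Cursor)
print(returns_cursor)

# The returned Cursor supports the usual fetch methods.
majors = result.fetchall()
print(majors)

conn.close()
```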

# 8. Fetching a Specific Number of Results

To make it easier to work with large results sets, the Cursor class allows us to control the number of results we want to retrieve at any given time. `To return a single result (as a tuple), we use the Cursor method fetchone(). To return n results, we use the Cursor method fetchmany().`

Each Cursor instance contains an internal counter that updates every time we retrieve results. When we call the fetchone() method, the Cursor instance will return a single result, and then increment its internal counter by 1. This means that if we call fetchone() again, the Cursor instance will actually return the second tuple in the results set (and increment by 1 again).

The fetchmany() method takes in an integer (n) and returns the corresponding results, starting from the current position. It then increments the Cursor instance's counter by n. In the following code, we return the first two results using the fetchone() method, then the next five results using the fetchmany() method.
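
A minimal sketch of that sequence follows. So that it runs without `jobs.db`, it loads the first eight majors from the earlier output into an in-memory table; that stand-in table is an assumption for illustration.

```python
import sqlite3

# Stand-in for jobs.db (an assumption): eight majors from the earlier output.
sample_majors = [
    ('PETROLEUM ENGINEERING',),
    ('MINING AND MINERAL ENGINEERING',),
    ('METALLURGICAL ENGINEERING',),
    ('NAVAL ARCHITECTURE AND MARINE ENGINEERING',),
    ('CHEMICAL ENGINEERING',),
    ('NUCLEAR ENGINEERING',),
    ('ACTUARIAL SCIENCE',),
    ('ASTRONOMY AND ASTROPHYSICS',),
]
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE recent_grads (Major TEXT)")
conn.executemany("INSERT INTO recent_grads VALUES (?)", sample_majors)

cursor = conn.cursor()
cursor.execute("SELECT Major FROM recent_grads")

first_result = cursor.fetchone()         # first tuple
second_result = cursor.fetchone()        # second tuple
next_five_results = cursor.fetchmany(5)  # tuples three through seven
remaining = cursor.fetchall()            # whatever is left (one tuple here)
exhausted = cursor.fetchone()            # None once no results remain

print(first_result)
print(second_result)
print(next_five_results)

conn.close()
```

Each call picks up where the previous one stopped, so no result is returned twice.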

## TODO:
* Write and run a query that returns the Major and Major_category columns from recent_grads.
* Then, fetch the first five results and store them as five_results.

In [6]:
con=sqlite3.connect('jobs.db')
cur=con.cursor()
query="SELECT Major,Major_category FROM recent_grads "
five_results=cur.execute(query).fetchmany(5)
five_results

[('PETROLEUM ENGINEERING', 'Engineering'),
 ('MINING AND MINERAL ENGINEERING', 'Engineering'),
 ('METALLURGICAL ENGINEERING', 'Engineering'),
 ('NAVAL ARCHITECTURE AND MARINE ENGINEERING', 'Engineering'),
 ('CHEMICAL ENGINEERING', 'Engineering')]

# 9. Closing the Database Connection

`To close a connection to a database, use the Connection instance method close().`

In [7]:
con.close()
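
The sketch below illustrates what closing means in practice: once `close()` has been called, the connection can no longer run queries. It also shows `contextlib.closing()` from the standard library as one optional way to make sure `close()` is called. Both parts use an in-memory database, which is an assumption so the example runs without `jobs.db`.

```python
import sqlite3
from contextlib import closing

conn = sqlite3.connect(':memory:')
conn.close()

# Any further use of a closed connection raises sqlite3.ProgrammingError.
try:
    conn.execute("SELECT 1")
    usable_after_close = True
except sqlite3.ProgrammingError as error:
    usable_after_close = False
    print("Connection is closed:", error)

# contextlib.closing() calls close() automatically when the block ends.
with closing(sqlite3.connect(':memory:')) as conn2:
    value = conn2.execute("SELECT 1").fetchone()
print(value)
```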

## TODO:
* Connect to the database jobs2.db, which contains the same data as jobs.db.
* Write and execute a query that returns all of the majors (Major) in reverse alphabetical order (Z to A).
* Assign the full result set to reverse_alphabetical.
* Finally, close the connection to the database.

In [8]:
import sqlite3 
conn=sqlite3.connect('jobs2.db')
cursor=conn.cursor()
query="SELECT Major FROM recent_grads ORDER BY Major desc "
reverse_alphabetical=cursor.execute(query).fetchall()
conn.close()
reverse_alphabetical

[('ZOOLOGY',),
 ('VISUAL AND PERFORMING ARTS',),
 ('UNITED STATES HISTORY',),
 ('TREATMENT THERAPY PROFESSIONS',),
 ('TRANSPORTATION SCIENCES AND TECHNOLOGIES',),
 ('THEOLOGY AND RELIGIOUS VOCATIONS',),
 ('TEACHER EDUCATION: MULTIPLE LEVELS',),
 ('STUDIO ARTS',),
 ('STATISTICS AND DECISION SCIENCE',),
 ('SPECIAL NEEDS EDUCATION',),
 ('SOIL SCIENCE',),
 ('SOCIOLOGY',),
 ('SOCIAL WORK',),
 ('SOCIAL SCIENCE OR HISTORY TEACHER EDUCATION',),
 ('SOCIAL PSYCHOLOGY',),
 ('SECONDARY TEACHER EDUCATION',),
 ('SCIENCE AND COMPUTER TEACHER EDUCATION',),
 ('SCHOOL STUDENT COUNSELING',),
 ('PUBLIC POLICY',),
 ('PUBLIC ADMINISTRATION',),
 ('PSYCHOLOGY',),
 ('PRE-LAW AND LEGAL STUDIES',),
 ('POLITICAL SCIENCE AND GOVERNMENT',),
 ('PLANT SCIENCE AND AGRONOMY',),
 ('PHYSIOLOGY',),
 ('PHYSICS',),
 ('PHYSICAL SCIENCES',),
 ('PHYSICAL FITNESS PARKS RECREATION AND LEISURE',),
 ('PHYSICAL AND HEALTH EDUCATION TEACHING',),
 ('PHILOSOPHY AND RELIGIOUS STUDIES',),
 ('PHARMACY PHARMACEUTICAL SCIENCES AND ADMINIST