# Mongo Examples Part 2
### Working with a larger data set

Sample Data from [United States Department of Transportation On-Time Performance](https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time)

Imported into MongoDB using [mongoimport](https://docs.mongodb.com/manual/reference/program/mongoimport/)

> mongoimport -d flights -c flightstats --type csv --file 642648472_T_ONTIME.csv --headerline


In [1]:
use flights

switched to db flights

**Grab the first record (find all, limit to one) and make it look pretty**

**find() takes two parameters, query and projection**
+ query is like the WHERE clause in SQL
+ projection lists the fields to be returned

**We start with both empty to get everything** _( find({},{}) = find({}) )_

In [2]:
db.flightstats.find({}).limit(1).pretty();
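
To make the two parameters concrete, here is a small sketch that fills in both at once. The filter on `CANCELLED` is an assumption chosen for illustration (it presumes a value of 1 marks a cancelled flight), and the `typeof db` guard is only there so the snippet also loads outside the mongo shell without error.

```javascript
// Query document: plays the role of a WHERE clause.
// Assumption: CANCELLED is 1 for cancelled flights.
var query = { "CANCELLED": 1 };

// Projection document: 1 marks a field to return.
var projection = { "FL_DATE": 1, "FL_NUM": 1, "ORIGIN_CITY_NAME": 1 };

// `db` only exists inside the mongo shell, so guard the call.
if (typeof db !== "undefined") {
    db.flightstats.find(query, projection).limit(1).forEach(printjson);
}
```

Keeping the two documents in named variables makes it easier to see which argument is the filter and which is the field list.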

**We don't need all of those fields, so limit the output to the few we care about**

In [3]:
db.flightstats.find(
    {},
    {
        "FL_DATE" : 1, "FL_NUM" : 1, "ORIGIN_CITY_NAME" : 1, 
        "DEST_CITY_NAME" : 1, "DEST_STATE_ABR" : 1, "DEST_STATE_NM" : 1, 
        "DEP_TIME" : 1, "ARR_TIME" : 1, "CANCELLED" : 1, 
        "DIVERTED" : 1, "ACTUAL_ELAPSED_TIME" : 1, "AIR_TIME" : 1
    }).limit(1).pretty();

{
	"_id" : ObjectId("59e9449672df5c7b88add6a1"),
	"FL_DATE" : "2017-06-05",
	"FL_NUM" : 4768,
	"ORIGIN_CITY_NAME" : "South Bend, IN",
	"DEST_CITY_NAME" : "Detroit, MI",
	"DEST_STATE_ABR" : "MI",
	"DEST_STATE_NM" : "Michigan",
	"DEP_TIME" : 1927,
	"ARR_TIME" : 2021,
	"CANCELLED" : 0,
	"DIVERTED" : 0,
	"ACTUAL_ELAPSED_TIME" : 54,
	"AIR_TIME" : 31
}
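
Note that the record above still carries `_id` even though the projection never listed it: `_id` is returned by default and only disappears when it is explicitly set to 0. The sketch below imitates that rule in plain JavaScript. The function `applyInclusionProjection` is hypothetical and is not how MongoDB implements projections; the sample is a cut-down copy of the record above with `_id` stored as a plain string.

```javascript
// Illustration only: a plain-JavaScript imitation of an inclusion
// projection, not MongoDB's implementation.
function applyInclusionProjection(doc, projection) {
    var result = {};
    // _id is returned by default; only an explicit 0 removes it.
    if (projection._id !== 0 && "_id" in doc) {
        result._id = doc._id;
    }
    Object.keys(projection).forEach(function (field) {
        if (field !== "_id" && projection[field] === 1 && field in doc) {
            result[field] = doc[field];
        }
    });
    return result;
}

// Cut-down copy of the record shown above.
var sample = {
    "_id": "59e9449672df5c7b88add6a1",
    "FL_DATE": "2017-06-05",
    "FL_NUM": 4768,
    "DEP_TIME": 1927,
    "ARR_TIME": 2021
};

// _id survives because the projection does not mention it.
var withId = applyInclusionProjection(sample, { "FL_DATE": 1, "FL_NUM": 1 });

// _id is dropped because it is explicitly set to 0.
var withoutId = applyInclusionProjection(sample, { "_id": 0, "FL_DATE": 1, "FL_NUM": 1 });
```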

**Drop the _id by setting it to 0**

In [4]:
db.flightstats.find(
    {},
    {
        "_id" : 0,
        "FL_DATE" : 1, "FL_NUM" : 1, "ORIGIN_CITY_NAME" : 1, 
        "DEST_CITY_NAME" : 1, "DEST_STATE_ABR" : 1, "DEST_STATE_NM" : 1, 
        "DEP_TIME" : 1, "ARR_TIME" : 1, "CANCELLED" : 1, 
        "DIVERTED" : 1, "ACTUAL_ELAPSED_TIME" : 1, "AIR_TIME" : 1
    }).limit(10).pretty();

{
	"FL_DATE" : "2017-06-05",
	"FL_NUM" : 4768,
	"ORIGIN_CITY_NAME" : "South Bend, IN",
	"DEST_CITY_NAME" : "Detroit, MI",
	"DEST_STATE_ABR" : "MI",
	"DEST_STATE_NM" : "Michigan",
	"DEP_TIME" : 1927,
	"ARR_TIME" : 2021,
	"CANCELLED" : 0,
	"DIVERTED" : 0,
	"ACTUAL_ELAPSED_TIME" : 54,
	"AIR_TIME" : 31
}
{
	"FL_DATE" : "2017-06-05",
	"FL_NUM" : 4769,
	"ORIGIN_CITY_NAME" : "La Crosse, WI",
	"DEST_CITY_NAME" : "Minneapolis, MN",
	"DEST_STATE_ABR" : "MN",
	"DEST_STATE_NM" : "Minnesota",
	"DEP_TIME" : 1122,
	"ARR_TIME" : 1214,
	"CANCELLED" : 0,
	"DIVERTED" : 0,
	"ACTUAL_ELAPSED_TIME" : 52,
	"AIR_TIME" : 31
}
{
	"FL_DATE" : "2017-06-05",
	"FL_NUM" : 4769,
	"ORIGIN_CITY_NAME" : "Minneapolis, MN",
	"DEST_CITY_NAME" : "La Crosse, WI",
	"DEST_STATE_ABR" : "WI",
	"DEST_STATE_NM" : "Wisconsin",
	"DEP_TIME" : 1000,
	"ARR_TIME" : 1101,
	"CANCELLED" : 0,
	"DIVERTED" : 0,
	"ACTUAL_ELAPSED_TIME" : 61,
	"AIR_TIME" : 26
}
{
	"FL_DATE" : "2017-06-05",
	"FL_NUM" : 4

**How many records are there in June?**

In [5]:
db.flightstats.find(
    {
        "MONTH": {$eq: 6}
    }).count()

494266
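
Two variations on the count above may be worth knowing. An equality test can be written without `$eq`, and more recent shells also offer `countDocuments()` alongside `cursor.count()`. Which methods are available depends on the shell and server version, so treat this as a sketch under that assumption rather than a drop-in replacement.

```javascript
// Explicit form used above, and the shorthand that matches the same documents.
var explicitQuery = { "MONTH": { $eq: 6 } };
var shorthandQuery = { "MONTH": 6 };

// `db` only exists inside the mongo shell, so guard the calls.
if (typeof db !== "undefined") {
    print(db.flightstats.find(shorthandQuery).count());
    // Assumption: the shell in use provides countDocuments().
    print(db.flightstats.countDocuments(explicitQuery));
}
```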

**How many flights were recorded on the 15th of the month?**
_(note that the list of conditions is enclosed in \[ and \])_

In [6]:
db.flightstats.find(
    { 
        $and: [
            {"MONTH": {$eq: 6}}, 
            {"DAY_OF_MONTH": {$eq: 15}}
        ]
    }).count()

17061
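
Because each condition here tests a different field, the same filter can also be written without `$and`: fields listed side by side in one query document are combined with an implicit AND. The sketch below checks that the two shapes agree, using a tiny in-memory matcher. The `matches` function and the three records are hypothetical, written only for this illustration, and cover just `$and`, `$eq` and `$lt`.

```javascript
// Illustration only: a tiny matcher covering just $and, $eq and $lt,
// to show that the two query shapes select the same records.
function matches(doc, query) {
    return Object.keys(query).every(function (key) {
        var condition = query[key];
        if (key === "$and") {
            return condition.every(function (sub) { return matches(doc, sub); });
        }
        if (condition !== null && typeof condition === "object") {
            return Object.keys(condition).every(function (op) {
                if (op === "$eq") { return doc[key] === condition[op]; }
                if (op === "$lt") { return doc[key] < condition[op]; }
                throw new Error("unsupported operator: " + op);
            });
        }
        return doc[key] === condition; // a bare value means equality
    });
}

// Hypothetical records, made up for this example.
var flights = [
    { "MONTH": 6, "DAY_OF_MONTH": 15 },
    { "MONTH": 6, "DAY_OF_MONTH": 16 },
    { "MONTH": 7, "DAY_OF_MONTH": 15 }
];

var explicitAnd = { $and: [ { "MONTH": { $eq: 6 } }, { "DAY_OF_MONTH": { $eq: 15 } } ] };
var implicitAnd = { "MONTH": 6, "DAY_OF_MONTH": 15 };

var explicitCount = flights.filter(function (f) { return matches(f, explicitAnd); }).length;
var implicitCount = flights.filter(function (f) { return matches(f, implicitAnd); }).length;
```

In the shell, the shorter form would be `db.flightstats.find({"MONTH": 6, "DAY_OF_MONTH": 15}).count()`. An explicit `$and` is still needed when the same field or operator has to appear more than once.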

**How many were going to Ohio?**

In [7]:
db.flightstats.find(
    { 
        $and: [
            {"DEST_STATE_ABR": {$eq: "OH"}},
            {"MONTH": {$eq: 6}}, 
            {"DAY_OF_MONTH": {$eq: 15}}
        ]}
    ).count()

219

**How many of those arrived early?**

In [8]:
db.flightstats.find({ 
    $and: [
        {"DEST_STATE_ABR": {$eq: "OH"}},
        {"MONTH": {$eq: 6}}, 
        {"DAY_OF_MONTH": {$eq: 15}},
        {"ARR_DELAY": {$lt: 0}}
    ]}
).count()

95

**Now let's look at those records, earliest arrivals first**

In [9]:
db.flightstats.find(
    { 
        $and: [
            {"DEST_STATE_ABR": {$eq: "OH"}},
            {"MONTH": {$eq: 6}}, 
            {"DAY_OF_MONTH": {$eq: 15}},
            {"ARR_DELAY": {$lt: 0}}
        ]}
    ).sort({"ARR_DELAY": 1}).forEach( function(theFlight)  { 
            print( "Flight " + theFlight.FL_NUM + ", on " + theFlight.FL_DATE + " from " +
                theFlight.ORIGIN_CITY_NAME + " to " + theFlight.DEST_CITY_NAME + " was " +
                (theFlight.ARR_DELAY * -1) + " minutes early"); 
    } )
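
The message-building logic in the callback above can be pulled into a named function, which makes it possible to try out on a single record before running it over a cursor. This is only a sketch: `describeEarlyFlight` is a hypothetical helper, and the example record is made up to resemble one flight, assuming an `ARR_DELAY` of -33 for a flight that was 33 minutes early.

```javascript
// Builds the same sentence as the inline callback above.
function describeEarlyFlight(theFlight) {
    return "Flight " + theFlight.FL_NUM + ", on " + theFlight.FL_DATE + " from " +
        theFlight.ORIGIN_CITY_NAME + " to " + theFlight.DEST_CITY_NAME + " was " +
        (theFlight.ARR_DELAY * -1) + " minutes early";
}

// Hypothetical record; a negative ARR_DELAY means the flight arrived early.
var example = {
    "FL_NUM": 1913,
    "FL_DATE": "2017-06-15",
    "ORIGIN_CITY_NAME": "Los Angeles, CA",
    "DEST_CITY_NAME": "Cleveland, OH",
    "ARR_DELAY": -33
};
var line = describeEarlyFlight(example);

// `db` only exists inside the mongo shell, so guard the call.
if (typeof db !== "undefined") {
    db.flightstats.find({
        $and: [
            {"DEST_STATE_ABR": {$eq: "OH"}},
            {"MONTH": {$eq: 6}},
            {"DAY_OF_MONTH": {$eq: 15}},
            {"ARR_DELAY": {$lt: 0}}
        ]
    }).sort({"ARR_DELAY": 1}).forEach(function (theFlight) {
        print(describeEarlyFlight(theFlight));
    });
}
```

Sorting on `ARR_DELAY` in ascending order puts the most negative delays first, so the flights that arrived earliest head the list.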


Flight 1913, on 2017-06-15 from Los Angeles, CA to Cleveland, OH was 33 minutes early
Flight 3954, on 2017-06-15 from New York, NY to Cleveland, OH was 32 minutes early
Flight 122, on 2017-06-15 from Myrtle Beach, SC to Akron, OH was 26 minutes early
Flight 1422, on 2017-06-15 from Portland, OR to Cleveland, OH was 26 minutes early
Flight 365, on 2017-06-15 from Denver, CO to Cleveland, OH was 26 minutes early
Flight 5180, on 2017-06-15 from New York, NY to Cleveland, OH was 26 minutes early
Flight 5259, on 2017-06-15 from Minneapolis, MN to Cleveland, OH was 26 minutes early
Flight 578, on 2017-06-15 from New Orleans, LA to Cleveland, OH was 25 minutes early
Flight 5914, on 2017-06-15 from Chicago, IL to Dayton, OH was 24 minutes early
Flight 4336, on 2017-06-15 from Boston, MA to Cleveland, OH was 24 minutes early
Flight 1367, on 2017-06-15 from Atlanta, GA to Cleveland, OH was 23 minutes early
Flight 4904, on 2017-06-15 from Minneapolis, MN to Dayton, OH was 22 minutes ea