This notebook uses ipython-sql (pip install ipython-sql) to run SQL queries directly from notebook cells. Prefix a single-line query with %sql, and put %%sql on the first line of a cell that holds a multi-line query.

In [1]:
%load_ext sql

In [16]:
%sql postgresql://nwespe@localhost/birth_db

u'Connected: nwespe@birth_db'

In [18]:
# RESET restores the default search path; use SET search_path TO schema_name if the table lives in another schema
%sql reset search_path

Done.


[]

In [19]:
%%sql
SELECT * FROM birth_data_table  -- * selects all columns
LIMIT 10   -- return at most 10 rows
OFFSET 50  -- skip the first 50 rows

10 rows affected.


index,alcohol_use,anencephaly,attendant,birth_loc_type,birth_month,birth_state,birth_weight,birth_year,cigarette_use,cigarettes_per_day,cigarettes_trimester1,cigarettes_trimester2,cigarettes_trimester3,day,delivery_method,downs syndrome,drinks_per_week,father_age,father_race,gestation_weeks,infant_sex,mother_age,mother_birth_country,mother_birth_state,mother_education,mother_marital_status,mother_race,mother_state,population,pregnancy_weight,resident,revision,spina_bifida,table,timestamp,uses_tobacco,weight_gain
50,,,MD,,Jan,,5000.0,2012,,,,,,Fri,Vaginal,,,,,42.0,F,20,,,,Yes,White,,,,Resident,S,,births12.txt,1325456800,,30.0
51,,,MD,,Jan,,3000.0,2012,,,,,,Fri,Vaginal,,,,,42.0,F,20,,,,Yes,White,,,,Resident,S,,births12.txt,1326371490,,32.0
52,,,MD,,Jan,,3000.0,2012,,,,,,Sun,Vaginal,,,,,38.0,F,21,,,,No,White,,,,Resident,S,,births12.txt,1327245443,,24.0
53,,,MD,,Feb,,4000.0,2012,,,,,,Wed,Cesarean,,,,,39.0,M,35,,,,Yes,White,,,,Intra-State/Territor Non-resident (diff county),S,,births12.txt,1328148996,,30.0
54,,,MD,,Feb,,3000.0,2012,,,,,,Wed,Vaginal,,,,,38.0,F,29,,,,Yes,White,,,,Resident,S,,births12.txt,1328933795,,34.0
55,,,MD,,Feb,,2000.0,2012,,,,,,Sat,Vaginal,,,,,31.0,M,27,,,,No,Black,,,,Intra-State/Territor Non-resident (diff county),S,,births12.txt,1329753116,,25.0
56,,,MD,,Mar,,4000.0,2012,,,,,,Tu,Vaginal,,,,,40.0,F,18,,,,No,White,,,,Resident,S,,births12.txt,1330659032,,50.0
57,,,MD,,Mar,,4000.0,2012,,,,,,Sun,Vaginal,,,,,38.0,F,29,,,,Yes,Black,,,,Resident,S,,births12.txt,1331541002,,50.0
58,,,MD,,Mar,,4000.0,2012,,,,,,Mon,Vaginal,,,,,39.0,F,29,,,,No,White,,,,Resident,S,,births12.txt,1332469424,,31.0
59,,,MD,,Apr,,4000.0,2012,,,,,,Sat,Vaginal,,,,,39.0,M,30,,,,No,White,,,,Resident,S,,births12.txt,1333389308,,3.0
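
To make the LIMIT and OFFSET behaviour concrete, here is a minimal sketch that uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the PostgreSQL connection. The table demo and its idx column are hypothetical and not part of birth_db. An ORDER BY is added because, without one, SQL does not guarantee which rows a LIMIT/OFFSET window returns.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (idx INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO demo VALUES (?)", [(i,) for i in range(100)])

# Skip the first 50 rows in idx order, then return at most 10 rows.
rows = conn.execute(
    "SELECT idx FROM demo ORDER BY idx LIMIT 10 OFFSET 50"
).fetchall()
page = [row[0] for row in rows]
print(page)
```

Because idx starts at 0, skipping 50 rows means the first returned value is 50, which matches the index column in the output above.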


What is the most common day for a delivery?

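Let's first create a view containing the days of the week and the number of births recorded on each day. The error shown below only means that the view already existed from an earlier run of this cell, so the existing view is used in what follows.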

In [43]:
%%sql 
CREATE VIEW day_counts AS
SELECT day, COUNT(*) AS count
FROM birth_data_table
GROUP BY day

(psycopg2.ProgrammingError) relation "day_counts" already exists
 [SQL: 'CREATE VIEW day_counts AS\nSELECT day, COUNT(*) AS count\nFROM birth_data_table\nGROUP BY day']
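
One way to keep a view-creating cell safe to re-run is to drop the view first if it exists. The sketch below shows that pattern with Python's built-in sqlite3 module and a tiny in-memory copy of the day column; the three sample rows and the helper name recreate_day_counts are assumptions for illustration. PostgreSQL also accepts CREATE OR REPLACE VIEW for the same purpose; DROP VIEW IF EXISTS is used here because it works in both SQLite and PostgreSQL.

```python
import sqlite3

# In-memory SQLite database with a few made-up rows (not the real birth data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE birth_data_table (day TEXT)")
conn.executemany(
    "INSERT INTO birth_data_table VALUES (?)",
    [("Tu",), ("Tu",), ("Sun",)],
)

CREATE_VIEW = """
CREATE VIEW day_counts AS
SELECT day, COUNT(*) AS count
FROM birth_data_table
GROUP BY day
"""

def recreate_day_counts(conn):
    """Drop the view if present, then create it, so reruns do not fail."""
    conn.execute("DROP VIEW IF EXISTS day_counts")
    conn.execute(CREATE_VIEW)

recreate_day_counts(conn)
recreate_day_counts(conn)  # a second run succeeds

# A bare CREATE VIEW on an existing view fails, as in the cell above.
try:
    conn.execute(CREATE_VIEW)
    second_create_failed = False
except sqlite3.OperationalError:
    second_create_failed = True

counts = dict(conn.execute("SELECT day, count FROM day_counts"))
print(counts, second_create_failed)
```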


In [44]:
%sql SELECT * FROM day_counts

7 rows affected.


day,count
Tu,449
Sat,307
Mon,403
Wed,415
Sun,281
Th,432
Fri,429


We can view this table ordered by count.

In [46]:
%%sql 
SELECT * 
FROM day_counts 
ORDER BY count DESC  -- sort by count, largest first

7 rows affected.


day,count
Tu,449
Th,432
Fri,429
Wed,415
Mon,403
Sat,307
Sun,281
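
For readers more at home in Python, the GROUP BY / COUNT(*) / ORDER BY count DESC combination corresponds roughly to tallying values with collections.Counter. This is only an analogy on a made-up list of days, not a query against birth_db.

```python
from collections import Counter

# Made-up sample of the day column (assumption for illustration).
days = ["Tu", "Sat", "Tu", "Mon", "Sun", "Tu", "Mon"]

# GROUP BY day with COUNT(*): one tally per distinct value.
day_counts = Counter(days)

# ORDER BY count DESC: pairs sorted from most to least frequent.
ranked = day_counts.most_common()
print(ranked)
```

Unlike Counter.most_common, a SQL query without ORDER BY returns groups in no guaranteed order, which is why the first day_counts listing above is unsorted.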


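For a solution in a single statement, see below. It names the grouped result in a common table expression (WITH) instead of relying on the view: an alias given to a subquery in the FROM clause is not visible inside the WHERE subquery, where day_counts would resolve to the view created earlier.

In [62]:
%%sql
WITH day_counts AS (  -- name the grouped result for the rest of the statement
    SELECT day, COUNT(*) AS count
    FROM birth_data_table
    GROUP BY day
)
SELECT day
FROM day_counts
WHERE count = (SELECT MAX(count) FROM day_counts)  -- keep the day(s) with the highest count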

1 rows affected.


day
Tu
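
One property of the MAX(count) comparison is that it returns every day tied for the highest count, whereas ORDER BY count DESC LIMIT 1 would return only one of them. The sketch below checks this with Python's built-in sqlite3 module, written with a WITH clause and run on made-up rows in which Tu and Th are tied; the sample data is an assumption and not taken from birth_db.

```python
import sqlite3

# In-memory SQLite database with made-up rows: Tu and Th are tied at 3 births.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE birth_data_table (day TEXT)")
days = ["Tu"] * 3 + ["Th"] * 3 + ["Sun"]
conn.executemany("INSERT INTO birth_data_table VALUES (?)", [(d,) for d in days])

query = """
WITH day_counts AS (
    SELECT day, COUNT(*) AS count
    FROM birth_data_table
    GROUP BY day
)
SELECT day
FROM day_counts
WHERE count = (SELECT MAX(count) FROM day_counts)
ORDER BY day
"""

# Every day whose count equals the maximum is returned, in alphabetical order.
most_common = [row[0] for row in conn.execute(query)]
print(most_common)
```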
