__Which state is the best in baseball?__ Since only states have two-letter abbreviations, let's get the total number of home runs per state, summed over all players.

Write a function `total_home_runs` that takes a string `state` as argument and returns the total number of home runs of all players born in this state. Next, query the `people` table to obtain a `pandas.Series` object `states` containing the unique two-letter `birthState` values in that table. Finally, use a suitable parallelization technique to call `total_home_runs` with each entry of `states`.

In [27]:
import sqlite3 as sql
import pandas as pd

In [28]:
db = sql.connect('../data/lahmansbaseballdb.sqlite', check_same_thread=False)

In [29]:
states = pd.read_sql('''
    SELECT DISTINCT birthState 
    FROM people
    WHERE LENGTH(birthstate)=2
    ORDER BY birthState
''' , db)['birthState']
states.head(3)

0    AB
1    AK
2    AL
Name: birthState, dtype: object

In [30]:
def total_home_runs(state):
    '''Returns the total number of home runs of all players from that state'''
    # Bind `state` as a query parameter instead of concatenating it into the SQL
    query = '''
    SELECT SUM(hr) AS total, birthState
    FROM batting AS b
    LEFT JOIN people AS p
    ON b.playerid = p.playerid
    WHERE birthState = ?
    '''
    return pd.read_sql(query, db, params=(state,))
total_home_runs('AB')

   total birthState
0     26         AB


_(Yes, that query could be improved to the point of not needing concurrency, as the grouped query further below shows; this is just for demonstration.)_
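
To illustrate that remark, here is a minimal sketch of how a single `GROUP BY` query can return every state's total at once. It runs against a tiny in-memory database whose tables and rows are made up for illustration (`toy_db` and its contents are assumptions, not the Lahman data), and it uses plain `sqlite3` so that it is self-contained.

```python
import sqlite3

# Hypothetical toy data standing in for the real `people` and `batting` tables.
toy_db = sqlite3.connect(":memory:")
toy_db.executescript("""
    CREATE TABLE people (playerID TEXT, birthState TEXT);
    CREATE TABLE batting (playerID TEXT, HR INTEGER);
    INSERT INTO people VALUES ('a', 'CA'), ('b', 'CA'), ('c', 'TX'), ('d', 'Tokyo');
    INSERT INTO batting VALUES ('a', 10), ('a', 5), ('b', 1), ('c', 7), ('d', 30);
""")

# One query computes the total for every two-letter birthState.
grouped_totals = toy_db.execute("""
    SELECT p.birthState, SUM(b.HR) AS total
    FROM batting AS b
    JOIN people AS p ON b.playerID = p.playerID
    WHERE LENGTH(p.birthState) = 2
    GROUP BY p.birthState
    ORDER BY total DESC
""").fetchall()

print(grouped_totals)
```

With this shape of query, the database does the per-state aggregation itself, so there is nothing left to parallelize on the Python side.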

In [31]:
import concurrent.futures

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # map() yields results in the same order as the entries of `states`
    result = executor.map(total_home_runs, states)
        

In [32]:
homeruns = [i.iloc[0] for i in result]

In [33]:
pd.DataFrame(homeruns).set_index('birthState').squeeze().sort_values(ascending=False).head(3)

birthState
CA    52477
TX    15449
FL    14918
Name: total, dtype: int64

Yay California! However, a raw total favors states with many players, so we should account for the number of players in the data set. This time, we can ignore concurrency.

In [39]:
pd.read_sql('''
    SELECT AVG(hr) AS total, birthState
    FROM batting AS b
    LEFT JOIN people AS p
    ON b.playerid = p.playerid
    GROUP BY birthState
    HAVING LENGTH(birthstate)=2
    ORDER BY total DESC
    ''', db).set_index('birthState').squeeze().head(3)

birthState
MB    6.944444
NM    5.842767
BC    5.151659
Name: total, dtype: float64

[MB](https://en.wikipedia.org/wiki/Manitoba) and [BC](https://en.wikipedia.org/wiki/British_Columbia) are Canadian provinces. Maybe <span>&#x1f1e8;&#x1f1e6;</span> is better at baseball than <span>&#x1f1fa;&#x1f1f8;</span>?
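
One caveat: `AVG(hr)` above averages over the rows of `batting`, and a player who appears in several rows (for example, one per season) is counted several times. The following sketch contrasts that per-row average with a per-player average, computed as the state's total divided by its number of distinct players. The in-memory tables and rows (`avg_db`) are made up for illustration and are not the Lahman data.

```python
import sqlite3

# Hypothetical toy data: player 'a' has two batting rows, 'b' and 'c' one each.
avg_db = sqlite3.connect(":memory:")
avg_db.executescript("""
    CREATE TABLE people (playerID TEXT, birthState TEXT);
    CREATE TABLE batting (playerID TEXT, HR INTEGER);
    INSERT INTO people VALUES ('a', 'CA'), ('b', 'CA'), ('c', 'TX');
    INSERT INTO batting VALUES ('a', 10), ('a', 5), ('b', 1), ('c', 7);
""")

averages = avg_db.execute("""
    SELECT p.birthState,
           AVG(b.HR) AS per_row,                                   -- average per batting row
           1.0 * SUM(b.HR) / COUNT(DISTINCT b.playerID) AS per_player  -- average per player
    FROM batting AS b
    JOIN people AS p ON b.playerID = p.playerID
    GROUP BY p.birthState
    ORDER BY p.birthState
""").fetchall()

for state, per_row, per_player in averages:
    print(state, round(per_row, 3), round(per_player, 3))
```

In the toy data, CA averages about 5.33 home runs per row but 8.0 per player, so the two definitions can rank states differently.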