<h1 style="display:none;">Test</h1>
# Introduction to Databases: Query and Web Application Continued


## Scenario I -- Implementation Continued


### Overview

We will use Django and MySQL to implement the following GET REST paths for each resource.


- _/api/resource_name/primary-key-value_ to get by primary key.


- _/api/resource_name/metadata_ will return a JSON object describing
    - Fields and types
    - Allowable queries, which we can use to configure the web front-end for user-defined queries.


- _/api/resource?_ with the following HTTP query parameters
    - f='name1, name2, name3, ...' defining the fields (Projection)
    - A Select expression, which takes one of the following forms:
        - name1=value1&name2=value2&...
        - Q='< query string >'
        
        
- Why restrict queries instead of supporting and exposing full SQL capabilities?
    - Our tables are small enough that naive, bad queries cannot hamper performance.
    - For large databases and table sizes, you can write queries that
        - Run for hours or days.
        - Prevent execution of other queries from other users.
        - Effectively "take the database down."
    - Database administrators often restrict allowed queries to prevent performance problems.
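
As a rough illustration of the query parameters described above, the sketch below splits a query string into a projection (the `f` field list) and a selection template of name=value pairs. The function name `parse_resource_query` and the sample parameters are assumptions made for illustration, and the `Q` form is deliberately left unhandled. This is not the course implementation.

```python
from urllib.parse import parse_qs


def parse_resource_query(query_string):
    """Split an HTTP query string into (projection fields, selection template)."""
    params = parse_qs(query_string, keep_blank_values=True)

    # 'f' holds a comma-separated field list; None means "return all fields".
    fields = None
    if "f" in params:
        fields = [name.strip() for name in params["f"][0].split(",") if name.strip()]

    # Every other parameter (except the unhandled 'Q') is treated as an
    # equality condition of the form name=value.
    template = {
        name: values[0]
        for name, values in params.items()
        if name not in ("f", "Q")
    }
    return fields, template


fields, template = parse_resource_query("f=nameLast,nameFirst&birthCountry=USA&throws=R")
print(fields)
print(template)
```

Restricting the selection to equality conditions keeps the set of possible queries small, which is one simple way to act on the performance concerns listed above.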


### Implementation Approach

- Building the solution is the core of HW 1 and the foundation for HW 2.


- We will go through elements of the implementation in class, and I will handle as much of the non-DB enablement functions as possible, e.g. AngularJS.


- We will incrementally expand our knowledge of SQL and the relational model as we add functions.


## Continuing Implementation

### Conceptual Data Model

Reminder
<br><br>
<img src="../images/conceptuallogicalphysical.jpeg">

- There are three entities
    - _Master_ represents information about an individual in the database.
    - _Appearances_ represents information about a person's appearances in games.
    - _Batting_ represents information about a player's batting for teams and seasons.
   
   
- How do you identify entity types/entities? From http://www.agiledata.org/essays/dataModeling101.html
    - "An entity type, also simply called entity (not exactly accurate terminology, but very common in practice), is similar conceptually to object-orientation’s concept of a class – an entity type represents a collection of similar objects.  An entity type could represent a collection of people, places, things, events, or concepts. Examples of entities in an order entry system would include Customer, Address, Order, Item, and Tax. If you were class modeling you would expect to discover classes with the exact same names. However, the difference between a class and an entity type is that classes have both data and behavior whereas entity types just have data. 
    - "Ideally an entity should be normal, the data modeling world’s version of cohesive. A normal entity depicts one concept, just like a cohesive class models one concept. For example, customer and order are clearly two different concepts; therefore it makes sense to model them as separate entities." 


- How do you identify relationships?
    - "In the real world entities have relationships with other entities.  For example, customers PLACE orders, customers LIVE AT addresses, and line items ARE PART OF orders. Place, live at, and are part of are all terms that define relationships between entities.  The relationships between entities are conceptually identical to the relationships (associations) between objects."  
 

<img src="../images/L3_baseball_conceptual.jpeg">

### Logical Data Model

##### Overview

- The logical data model requires adding:
    - Attributes
    - Primary Keys
    - Foreign Keys
    
##### Attributes

Identifying attributes (http://www.agiledata.org/essays/dataModeling101.html)

- "Each entity type will have one or more data attributes.  For example, 
    - ... [a] Customer entity has attributes such as First Name and Surname and ... 
    - the TCUSTOMER table had corresponding data columns CUST_FIRST_NAME and CUST_SURNAME (a column is the implementation of a data attribute within a relational database). 
    
    
- Attributes should also be cohesive from the point of view of your domain, something that is often a judgment call. ... ... 
    - we decided that we wanted to model the fact that people had both first and last names instead of just a name (e.g. “Scott" and “Ambler" vs. “Scott Ambler")
    - we did not distinguish between the sections of an American zip code (e.g. 90210-1234-5678).
    
    
- Getting the level of detail right can have a significant impact on your development and maintenance efforts.
    - Refactoring a single data column into several columns can be difficult, ...
    - over-specifying an attribute (e.g. having three attributes for zip code when you only needed one) can result in overbuilding your system and hence you incur greater development and maintenance costs than you actually needed.
    

- In our scenario,
    - We were given the data, which partially defined the attributes.
    - We could have, and will, re-factor how the given data fits into a good data model.

<br><br>
    

<img src="../images/masterlogical.jpeg">
<br>
<img src="../images/battinglogical.jpeg" width="80%">
<img src="../images/appearanceslogical.jpeg">

##### Keys and Primary Keys

Ramakrishnan and Gehrke, 2.4.1, 3.2

_Relational Theory_

An (entity) key is a set of attributes that uniquely identifies an entity in an entity set. Entity keys can be _super,_ _candidate_ or _primary._
- _Super key:_ A set of one or more attributes that together uniquely identify an entity in an entity set.
- _Candidate key:_ A minimal super key, meaning that removing any attribute leaves a set that is no longer a super key. An entity set may have more than one candidate key.
- _Primary key:_ A candidate key chosen by the database designer to uniquely identify entities in the entity set.

In our data model,
- _Master_ primary key is _playerID_
- _Batting_ is more complicated.
    - _playerID_ does not uniquely identify a row. Players play for many years.
    - _(playerID, yearID)_ does not uniquely identify a row. A player could get traded, and play for more than a single team in a year.
    - No problem, we can use _(playerID, yearID, teamID)._ But, a player can have more than one stint with a team in a year.
    - The answer is _(playerID, yearID, stint)._
        - How do I know this? I understand baseball.
        - What if you or I do not understand the domain? We are typically working with a domain expert, and these decisions are part of collaborative design in the logical modeling phase.
- _Appearances_ primary key is _(playerID, yearID, teamID)._

_An aside:_ I ran the following queries for batting
```
-- (1) What is the maximum number of rows for a given playerID? Also, look up the names. 
SELECT playerID,
	(SELECT nameLast FROM Master WHERE Master.playerID=Batting.playerID) as nameLast,
    (SELECT nameFirst FROM Master WHERE Master.playerID=Batting.playerID) as nameFirst,
    count(*) as row_count FROM batting GROUP BY playerID,nameFirst,nameLast
	ORDER BY row_count DESC LIMIT 1;

-- (2) What is the maximum number of rows if I try playerID and yearID for a primary key? Also, look up the names. 
SELECT playerID,
	(SELECT nameLast FROM Master WHERE Master.playerID=Batting.playerID) as nameLast,
    (SELECT nameFirst FROM Master WHERE Master.playerID=Batting.playerID) as nameFirst,
    count(*) as row_count FROM batting GROUP BY playerID, yearID
	ORDER BY row_count DESC LIMIT 1;

-- (3) Same question using playerID, yearID, teamID. Also, look up the names. 
SELECT playerID,
	(SELECT nameLast FROM Master WHERE Master.playerID=Batting.playerID) as nameLast,
    (SELECT nameFirst FROM Master WHERE Master.playerID=Batting.playerID) as nameFirst,
    count(*) as row_count FROM batting GROUP BY playerID, yearID, teamID
	ORDER BY row_count DESC LIMIT 1;

-- (4) Same question using playerID, yearID, stint. Also, look up the names. 
SELECT playerID,
	(SELECT nameLast FROM Master WHERE Master.playerID=Batting.playerID) as nameLast,
    (SELECT nameFirst FROM Master WHERE Master.playerID=Batting.playerID) as nameFirst,
    count(*) as row_count FROM batting GROUP BY playerID,yearID,stint
	ORDER BY row_count DESC LIMIT 1;
```

These queries returned the following information.

| Query No. | Possible Key             | playerID  | last name | first name | row count |
|-----------|--------------------------|-----------|-----------|------------|-----------|
| 1         | playerID                 | mcguide01 | McGuire   | Deacon     | 31        |
| 2         | playerID, yearID         | chouife01 | Chouinard | Felix      | 5         |
| 3         | playerID, yearID, teamID | chouife01 | Chouinard | Felix      | 3         |
| 4         | playerID, yearID, stint  | zay01     | Zay       | William    | 1         |

What do the queries do? For a possible key combination, each query
- Groups the rows by that combination of columns and counts the rows in each group.
- Returns the largest count.
- Provides information about one of the groups with the largest count.

Do not worry if you do not understand these queries, _you will!_ But the queries verify that _(playerID, yearID, stint)_ uniquely identifies a row/entry.
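
The same check can be sketched in plain Python on a handful of rows. The rows below are hypothetical (they are not taken from the Lahman data), and `max_rows_per_key` is an illustrative helper name. A column set can serve as a key only if the largest group has exactly one row.

```python
from collections import Counter


def max_rows_per_key(rows, key_columns):
    """Return (largest group size, one key value with that size)."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    key, count = counts.most_common(1)[0]
    return count, key


# Hypothetical rows: one player who leaves a team and returns in the same year.
rows = [
    {"playerID": "p1", "yearID": 2000, "teamID": "AAA", "stint": 1},
    {"playerID": "p1", "yearID": 2000, "teamID": "BBB", "stint": 2},
    {"playerID": "p1", "yearID": 2000, "teamID": "AAA", "stint": 3},
    {"playerID": "p1", "yearID": 2001, "teamID": "AAA", "stint": 1},
]

candidates = (
    ["playerID"],
    ["playerID", "yearID"],
    ["playerID", "yearID", "teamID"],
    ["playerID", "yearID", "stint"],
)
for candidate in candidates:
    count, _ = max_rows_per_key(rows, candidate)
    print(candidate, "->", count)
```

Only the last candidate yields a maximum group size of 1 on this sample, which mirrors the pattern in the table above.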

##### Foreign Keys

Ramakrishnan and Gehrke, section 3.2.2

"In the context of relational databases, a foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table or the same table. In simpler words, the foreign key is defined in a second table, but it refers to the primary key or a unique key in the first table." (https://en.wikipedia.org/wiki/Foreign_key)

There are at least two perspectives on _foreign key:_
1. Foreign keys implement _Integrity Constraints,_ which we will cover later. A tuple in one table can exist only if its foreign key matches a primary key in the referenced table.
2. Foreign keys define _Relationships._ I can use a foreign key to find tuples in two different tables that are related.

In our simple example
- Batting.playerID is a foreign key referencing Master.playerID.
- Appearances.playerID is a foreign key referencing Master.playerID.

We will see this in more detail in future lectures.
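
Both perspectives can be tried out in a small, self-contained sketch. It uses Python's built-in `sqlite3` module instead of MySQL so that it runs without a server (an assumption made purely for convenience), and the tables are cut down to a few hypothetical columns and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Master (playerID TEXT PRIMARY KEY, nameLast TEXT)")
conn.execute(
    "CREATE TABLE Batting ("
    " playerID TEXT NOT NULL REFERENCES Master (playerID),"
    " yearID INTEGER NOT NULL, stint INTEGER NOT NULL,"
    " PRIMARY KEY (playerID, yearID, stint))"
)

conn.execute("INSERT INTO Master VALUES ('p1', 'Example')")
conn.execute("INSERT INTO Batting VALUES ('p1', 2000, 1)")  # parent row exists

# Perspective 1: integrity constraint. A Batting row without a Master row is rejected.
try:
    conn.execute("INSERT INTO Batting VALUES ('nobody', 2000, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("Orphan row rejected:", rejected)

# Perspective 2: relationship. The foreign key tells us how to find related tuples.
related = conn.execute(
    "SELECT m.nameLast, b.yearID FROM Master m"
    " JOIN Batting b ON b.playerID = m.playerID"
).fetchall()
print(related)
```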


### SQL Data Model (Our Initial Physical Model)

The physical model requires that we add the following:


- The create table DDL statement
    - Table names
    - Column names
    - Column data types
    
    
- Instead of drawing a diagram, we will define the model directly in SQL DDL.


- Note: We should adjust the column types and sizes.

```
CREATE TABLE `Master` (
  `playerID` varchar(255) NOT NULL,
  `birthYear` int(11) DEFAULT NULL,
  `birthMonth` int(11) NOT NULL,
  `birthDay` int(11) DEFAULT NULL,
  `birthCountry` varchar(255) DEFAULT NULL,
  `birthState` varchar(255) DEFAULT NULL,
  `birthCity` varchar(255) DEFAULT NULL,
  `deathYear` varchar(255) DEFAULT NULL,
  `deathMonth` varchar(255) DEFAULT NULL,
  `deathDay` varchar(255) DEFAULT NULL,
  `deathCountry` varchar(255) DEFAULT NULL,
  `deathState` varchar(255) DEFAULT NULL,
  `deathCity` varchar(255) DEFAULT NULL,
  `nameFirst` varchar(255) NOT NULL,
  `nameLast` varchar(255) NOT NULL,
  `nameGiven` varchar(255) DEFAULT NULL,
  `weight` int(11) DEFAULT NULL,
  `height` int(11) DEFAULT NULL,
  `bats` varchar(255) DEFAULT NULL,
  `throws` varchar(255) DEFAULT NULL,
  `debut` varchar(255) DEFAULT NULL,
  `finalGame` varchar(255) DEFAULT NULL,
  `retroID` varchar(255) DEFAULT NULL,
  `bbrefID` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`playerID`),
  KEY `player_idx` (`playerID`),
  KEY `name_l` (`nameLast`),
  KEY `name_f` (`nameFirst`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `Appearances` (
  `yearID` int(11) NOT NULL,
  `teamID` varchar(255) NOT NULL,
  `lgID` varchar(255) DEFAULT NULL,
  `playerID` varchar(255) NOT NULL,
  `G_all` int(11) DEFAULT NULL,
  `GS` varchar(255) DEFAULT NULL,
  `G_batting` int(11) DEFAULT NULL,
  `G_defense` int(11) DEFAULT NULL,
  `G_p` int(11) DEFAULT NULL,
  `G_c` int(11) DEFAULT NULL,
  `G_1b` int(11) DEFAULT NULL,
  `G_2b` int(11) DEFAULT NULL,
  `G_3b` int(11) DEFAULT NULL,
  `G_ss` int(11) DEFAULT NULL,
  `G_lf` int(11) DEFAULT NULL,
  `G_cf` int(11) DEFAULT NULL,
  `G_rf` int(11) DEFAULT NULL,
  `G_of` int(11) DEFAULT NULL,
  `G_dh` varchar(255) DEFAULT NULL,
  `G_ph` varchar(255) DEFAULT NULL,
  `G_pr` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`yearID`,`teamID`,`playerID`),
  UNIQUE KEY `ux` (`playerID`,`teamID`,`yearID`),
  KEY `player_idx` (`playerID`),
  KEY `year_idx` (`yearID`) USING BTREE,
  CONSTRAINT `playerID` FOREIGN KEY (`playerID`) REFERENCES `Master` (`playerID`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `Batting` (
  `playerID` varchar(255) NOT NULL,
  `yearID` int(11) NOT NULL,
  `stint` int(11) NOT NULL,
  `teamID` varchar(255) DEFAULT NULL,
  `lgID` varchar(255) DEFAULT NULL,
  `G` int(11) DEFAULT NULL,
  `AB` int(11) DEFAULT NULL,
  `R` int(11) DEFAULT NULL,
  `H` int(11) DEFAULT NULL,
  `2B` int(11) DEFAULT NULL,
  `3B` int(11) DEFAULT NULL,
  `HR` int(11) DEFAULT NULL,
  `RBI` int(11) DEFAULT NULL,
  `SB` int(11) DEFAULT NULL,
  `CS` int(11) DEFAULT NULL,
  `BB` int(11) DEFAULT NULL,
  `SO` int(11) DEFAULT NULL,
  `IBB` varchar(255) DEFAULT NULL,
  `HBP` varchar(255) DEFAULT NULL,
  `SH` varchar(255) DEFAULT NULL,
  `SF` varchar(255) DEFAULT NULL,
  `GIDP` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`playerID`,`yearID`,`stint`),
  CONSTRAINT `batting_player` FOREIGN KEY (`playerID`) REFERENCES `Master` (`playerID`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
 
```

- The core SQL column types are
<br><br>
<img src="../images/datatypes.jpg">


- All database management systems significantly extend the set of data/column types.


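- The type and length options play a significant role in the database management system's optimization of storage use, which we will cover in future lectures.
    - TINYINT, SMALLINT, INT, BIGINT, ... In MySQL, the number in INT(8) or INT(11) is only a display width; the integer type determines the storage size.
    - VARCHAR(16), VARCHAR(1024), ... Here the number is the maximum length in characters.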
    

- If we have the DDL defined and have created the tables, we can _reverse engineer_ the physical model.


- The line endings on the relationship connectors in the diagram below have very precise meanings, which we will cover in future lectures.


<img src="../images/scenario1physical.jpeg">

### Web Resource Model -- Getting Started

We plan to start by retrieving and displaying information about players. Our web resource model (URLs) will be
- /players?<query> to find players matching a template.
- /players/playerID to find a specific player.
    
_Example_

GET http://localhost:8000/baseball/api/players/napolmi01

Returns

```
{
    "playerID": "napolmi01",
    "birthYear": 1981,
    "birthMonth": "10",
    "birthDay": 31,
    "birthCountry": "USA",
    "birthState": "FL",
    "birthCity": "Hollywood",
    "deathYear": "",
    "deathMonth": "",
    "deathDay": "",
    "deathCountry": "",
    "deathState": "",
    "deathCity": "",
    "nameFirst": "Mike",
    "nameLast": "Napoli",
    "nameGiven": "Michael Anthony",
    "weight": 225,
    "height": 73,
    "bats": "R",
    "throws": "R",
    "debut": "2006-05-04",
    "finalGame": "2016-10-02",
    "retroID": "napom001",
    "bbrefID": "napolmi01"
}
```

<img src="../images/playergetpostman.jpeg">


### Overall Code Structure

- Django [first application tutorial](https://docs.djangoproject.com/en/2.0/intro/tutorial01/) is pretty good.


- I will set up as much of this as I can so that you can focus on the data access layer.


- Code Walkthrough (Will provide zip version)

### Find by Primary Key

#### Overview

- The primary keys are
    - Players (Master) = playerID
    - Batting is a compound key (playerID, yearID, stint)
    - Appearances is (playerID, yearID, teamID)


- The format for the GET by primary key path is /api/baseball/$<resource \ type>$/$<primary\ key\ value>$ 


#### Implementation Step 1 -- Primary Keys.

Well, we know the implementing function looks something like


In [4]:
import json

def find_by_primary_key(resource, key_name, key_value):
    cnx = connect()
    try:
        cursor = cnx.cursor()
        # Table and column names cannot be bound as parameters, so they are
        # concatenated. The key value is bound by the driver, which avoids
        # quoting bugs and SQL injection.
        q = "SELECT * FROM " + resource + " WHERE " + key_name + " = %s"
        print("Query = ", q)
        cursor.execute(q, (key_value,))
        r = cursor.fetchone()
    finally:
        # Always release the connection, even if the query fails.
        cnx.close()
    return r

# Just test code below so function executes in Jupyter for presentation.
r = find_by_primary_key("Master", "playerID", "willite01")
print(json.dumps(r, indent=2))

Query =  SELECT * FROM Master WHERE playerID = %s

- There are a few issues that we need to work out:
    - Error handling and error codes.
    - Better data mapping, specifically
        - "" is an artifact of data import. Text files cannot represent NULL.
        - Dates: 
            - birth and death handled differently from finalGame and debut.
            - finalGame and debut are VARCHAR in database, not MySQL DATETIME type.
    - Compound keys are a more complex problem
        - /api/players/napolmi01 is fine for single column keys.
        - What do we do for the compound key (playerID, yearID, stint) for Batting?
        - We could use query params, but this would result in inconsistent URL patterns for resources, and break our model that there are resource collections that contain individually identifiable resource elements.
        - [Resource instances having a unique ID relative](https://cloud.google.com/apis/design/resource_names) to the set of resources is a best practice.
        - We will use a delimiter and arbitrarily choose "-". This yields /api/batting/willite01-1960-1.
    - We will handle some of the issues, but this is not a course on Python, web UI, HTTP/HTML types, etc.


- There are [design patterns](https://en.wikipedia.org/wiki/Software_design_pattern) for dealing with these (and other) issues.
    - We will use some elements of some design patterns, but will not be rigorous.
    - A simplified [data access object pattern](https://en.wikipedia.org/wiki/Data_access_object) will be useful because our projects will access multiple datasources in the future.
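
The data-mapping issues listed above (empty strings standing in for NULL, and dates stored as text) could be handled by a small normalization step inside the data access layer. The sketch below is one possible approach, not the course implementation; `normalize_row` and `DATE_COLUMNS` are hypothetical names, and the `YYYY-MM-DD` format is assumed from the sample response shown earlier.

```python
from datetime import date, datetime

# Columns assumed to hold 'YYYY-MM-DD' text in the imported data.
DATE_COLUMNS = ("debut", "finalGame")


def normalize_row(row):
    """Map empty strings to None and parse text dates into datetime.date values."""
    clean = {}
    for column, value in row.items():
        if value == "":
            # "" is an artifact of the text import, so treat it as NULL.
            clean[column] = None
        elif column in DATE_COLUMNS and isinstance(value, str):
            clean[column] = datetime.strptime(value, "%Y-%m-%d").date()
        else:
            clean[column] = value
    return clean


row = {"playerID": "napolmi01", "deathYear": "", "debut": "2006-05-04", "weight": 225}
print(normalize_row(row))
```

Keeping this mapping in one place means the rest of the application never has to know how the import represented missing values.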
    

<img src="../images/L4_BO_DO.png" width="60%">

- The _business object_ implements the application's behavior, correctness, etc.


- The _data access object_ isolates business logic from
    - Schema change and evolution. (We will clean up the schema over time)
    - Differences between databases. Business logic developers focus on the application, not specifics of
        - MySQL versus Oracle versus DB2.
        - Data implementation choices, e.g. relational versus key-value versus graph.
    - There are frameworks for the DAO pattern, e.g.
        - [ADO.NET](https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/ado-net-overview)
        - [OData](http://www.odata.org/)
        - [Django Models](https://docs.djangoproject.com/en/2.0/topics/db/models/)
        
        
- Again, this is not a web application or design patterns class. We will use simple approaches.
        

In [None]:
# Implement a very simple data access object pattern
# We will not use types, classes, metadata, etc.
# We will simply use dictionaries.
import pymysql.cursors
import json
from datetime import datetime

def debug_message(s, o):
    print(str(datetime.now()) + ": " + s)
    if o is not None:
        print(json.dumps(o,indent=2,sort_keys=True))

# Security is a complex topic, which we will cover later.
# NEVER put security credentials in files/code.
def connect():
    connection = pymysql.connect(host='localhost',
                                 user='dbuser',
                                 password='dbuser',
                                 db='lahman2016',
                                 charset='utf8mb4',
                                 cursorclass=pymysql.cursors.DictCursor)
    return connection

def disconnect(c):
    c.close()


# This is an abstraction. We map
# - entity_set to table.
# - key to the primary key: This will come in as a string, and
#   may map to a compound key.
#
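def find_by_id(entity_set, key):
    mapped_info = map_entity_set_key(entity_set, key)
    if mapped_info is None:
        # Unknown entity set: there is no table to query.
        return None

    # Connect before the try block so that the finally clause never
    # refers to an unbound name if the connection attempt fails.
    connection = connect()
    try:
        with connection.cursor() as cursor:
            sql = generate_select_statement(mapped_info)
            debug_message("SQL = " + sql, None)
            cursor.execute(sql)
            # fetchone() returns None when no row matches the key.
            result = cursor.fetchone()
            debug_message("Result = ", result)
    finally:
        disconnect(connection)

    return result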


# We will handle this extensibly with metadata later. For now
# we simply hard-code the mapping. Returns the table and a dictionary of
# column: value needed for the primary key
#
def map_entity_set_key(entity_set, key):

    r = {}
    done = False

    if (entity_set == "players"):
        r = {
            "table": "players",
            "columns" : ["playerID"],
            "values": [key],
            "types": ["s"]
        }
        done = True

    if (entity_set == "batting"):
        s = key.split("-")
        print("s = ",s)
        r = {
            "table": "batting",
            "columns" : [ "playerID", "yearID", "stint" ],
            "values" : [s[0], s[1], s[2]],
            "types" : ["s", "s", "i"]
        }
        done = True

    if not done:
        r = None

    return r

# Input is an entity plus (column, value) template.
# Output is a query string

def generate_select_statement(map_info):
    s = "SELECT * FROM " + map_info.get("table") + " WHERE "
    w = ""

    columns = map_info.get("columns")
    values = map_info.get("values")
    types = map_info.get("types")

    print("columns = ", columns)

    for i in range(0,len(columns)):
        c = columns[i]
        v = values[i]
        t = types[i]

        if w != "" :
            w = w + " AND "

        w = w + c + "="
        if t == "s":
            w = w + "'" + v + "'"
        else:
            w = w + v

    return s+w

e1 = find_by_id("batting","willite01-1960-1")
e2 = find_by_id("players","willite01")
debug_message("\n\nFind by key willite01-1960-1 in batting returned ", e1)
debug_message("\n\nFind by key willite01 in players returned ", e2)

__Now you can see why people use frameworks.__ This can get tedious.
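
One way to trim the quoting logic in a function like `generate_select_statement` is to emit placeholders and let the database driver bind the values. The sketch below only builds the statement and the parameter tuple, so it runs without a database; `build_select` is a hypothetical helper, and the `%s` placeholder style is the one PyMySQL accepts in `cursor.execute(sql, params)`.

```python
def build_select(table, columns, values):
    """Return (sql, params) with one %s placeholder per key column."""
    if len(columns) != len(values):
        raise ValueError(
            "expected %d key parts, got %d" % (len(columns), len(values))
        )
    # Table and column names come from our own hard-coded mapping, never
    # from user input, so concatenating them is acceptable here.
    where = " AND ".join(column + " = %s" for column in columns)
    return "SELECT * FROM " + table + " WHERE " + where, tuple(values)


# Split the compound key on the "-" delimiter chosen earlier.
sql, params = build_select(
    "batting", ["playerID", "yearID", "stint"], "willite01-1960-1".split("-")
)
print(sql)
print(params)
# With a live connection, one would then call: cursor.execute(sql, params)
```

Because the driver handles quoting, the per-column type list is no longer needed to decide whether to wrap a value in quotes.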


### Implementation Step 2: More Complex Paths -- Part of HW 2

- /api/batting/willite01-1960-1 is the URL to a specific resource.


- Before going to more general query, we should also support the paths
    - /api/batting/willite01 to return all of the player's batting records.
    - /api/batting/willite01/1960 to return the player's records from 1960.
    
    
- This is an interim step toward general queries.


- We will cover this in more detail later in the semester, but if (playerID, yearID, stint) is a compound key, then the DB engine can also use the index on that key for lookups on its leftmost prefixes
    - playerID,
    - (playerID, yearID)
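
The two paths above each supply a leading part of the compound key. A minimal sketch of turning path segments into a column/value template is shown below; `key_prefix_template` and `BATTING_KEY` are illustrative names, and routing the URL to this function is left out.

```python
BATTING_KEY = ["playerID", "yearID", "stint"]


def key_prefix_template(path_segments, key_columns=BATTING_KEY):
    """Pair the leading key columns with the path segments that were supplied."""
    if not 1 <= len(path_segments) <= len(key_columns):
        raise ValueError("expected between 1 and %d key parts" % len(key_columns))
    # zip stops at the shorter sequence, so only the leading columns are used.
    return dict(zip(key_columns, path_segments))


print(key_prefix_template(["willite01"]))
print(key_prefix_template(["willite01", "1960"]))
```

The resulting dictionary has the same shape as a name=value query template, so the same WHERE-clause generation can serve both cases.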



### Implementation Step 3: Subqueries/Nested Queries -- Part of HW2

#### Overview

Ramakrishnan and Gehrke, section 5.4

<img src="../images/L4_subquery.jpeg">

Additional perspective (https://www.w3resource.com/sql/subqueries/understanding-sql-subqueries.php)

- A subquery is a SQL query nested inside a larger query.


- A subquery may occur in :
    - A SELECT clause
    - A FROM clause
    - A WHERE clause
    
    
- The subquery can be nested inside a SELECT, INSERT, UPDATE, or DELETE statement or inside another subquery.


- A subquery is usually added within the WHERE Clause of another SQL SELECT statement.


- You can use the comparison operators, such as >, <, or =. The comparison operator can also be a multiple-row operator, such as IN, ANY, or ALL.


- A subquery is also called an inner query or inner select, while the statement containing a subquery is also called an outer query or outer select.


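- An uncorrelated inner query executes before its parent query so that its results can be passed to the outer query. A correlated subquery, one that references a column of the outer query as the examples below do, is conceptually evaluated once for each row of the outer query.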


#### Examples

The query 

```
SELECT * FROM lahman2016.Batting
	WHERE playerID = 'willite01';
```

Returns

<img src="../images/L4_subquery_example.jpeg">

- This is useful but presumes that you know who 'willite01' is.


- You may not know who willite01 is if I email you the link /api/batting/willite01 and say "Hey, look at this."


A better query might be

```
SELECT
	(SELECT nameLast FROM Master WHERE Master.playerID=Batting.playerID) as last_name,
    (SELECT nameFirst FROM Master WHERE Master.playerID=Batting.playerID) as first_name,
    Batting.*
    FROM lahman2016.Batting
	WHERE playerID = 'willite01';
```

Which produces

<img src="../images/L4_subquery_example_2.jpeg">

- This is a bit contrived but demonstrates the key concept. We sometimes
    - Want to "look up" something in another table A.
    - When evaluating a query on table B.
    - And find the thing in A based on something in the result set for B, or from query inputs.
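
The "look up in table A while querying table B" idea can be tried on a tiny in-memory database. The sketch uses Python's built-in `sqlite3` module rather than MySQL so that it is self-contained, and the players and statistics are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Master (playerID TEXT PRIMARY KEY, nameLast TEXT, nameFirst TEXT);
    CREATE TABLE Batting (playerID TEXT, yearID INTEGER, stint INTEGER, H INTEGER);
    INSERT INTO Master VALUES ('p1', 'Doe', 'Pat'), ('p2', 'Roe', 'Sam');
    INSERT INTO Batting VALUES ('p1', 2000, 1, 10), ('p1', 2001, 1, 20), ('p2', 2000, 1, 5);
""")

# The inner SELECT refers to Batting.playerID from the outer query, so it is
# a correlated subquery: it looks up the name for each Batting row.
rows = conn.execute("""
    SELECT
        (SELECT nameLast FROM Master WHERE Master.playerID = Batting.playerID) AS last_name,
        Batting.yearID,
        Batting.H
    FROM Batting
    WHERE playerID = ?
    ORDER BY yearID
""", ("p1",)).fetchall()

print(rows)
```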
    

### Implementation Step 4: Group By and Aggregation

- This is just a preview of the concept.


- What if I want to know the _career information,_ not individual season information?


- I have to find all the batting tuples for a player, put them in a group, and then apply aggregate functions.


- The query

```
SELECT
	(SELECT nameLast FROM Master WHERE Master.playerID=Batting.playerID) as last_name,
    (SELECT nameFirst FROM Master WHERE Master.playerID=Batting.playerID) as first_name,
    Batting.playerID,
    sum(Batting.g) as total_games,
    sum(Batting.ab) as total_at_bats,
    sum(Batting.h) as total_hits,
    sum(Batting.h)/sum(Batting.ab) * 100 as career_average
FROM
	lahman2016.Batting
WHERE
	playerID = 'willite01'
GROUP BY playerID, last_name, first_name;
```

Returns

<img src="../images/L4_aggregate_1.jpeg">
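
Conceptually, GROUP BY followed by aggregate functions is a two-step process, which the plain-Python sketch below spells out. The season rows are hypothetical, and the average is left as a fraction rather than scaled.

```python
from collections import defaultdict

# Hypothetical season rows: (playerID, G, AB, H).
seasons = [
    ("p1", 100, 400, 120),
    ("p1", 150, 600, 180),
    ("p2", 50, 200, 40),
]

# Step 1: GROUP BY playerID puts each player's tuples into one group.
groups = defaultdict(list)
for player_id, g, ab, h in seasons:
    groups[player_id].append((g, ab, h))

# Step 2: aggregate functions collapse each group into a single row.
career = {}
for player_id, rows in groups.items():
    total_ab = sum(ab for _, ab, _ in rows)
    total_h = sum(h for _, _, h in rows)
    career[player_id] = {
        "total_games": sum(g for g, _, _ in rows),
        "total_at_bats": total_ab,
        "total_hits": total_h,
        "career_average": total_h / total_ab,
    }

print(career["p1"])
print(career["p2"])
```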


### Implementation Step 5: Order By and Limits

- What if I want to compare players based on aggregation and find the 10 best?


- The query

```
SELECT
	(SELECT nameLast FROM Master WHERE Master.playerID=Batting.playerID) as last_name,
    (SELECT nameFirst FROM Master WHERE Master.playerID=Batting.playerID) as first_name,
    Batting.playerID,
    sum(Batting.g) as total_games,
    sum(Batting.ab) as total_at_bats,
    sum(Batting.h) as total_hits,
    sum(Batting.h)/sum(Batting.ab) * 1000 as career_average
FROM
	lahman2016.Batting
GROUP BY
	playerID, last_name, first_name
HAVING
	total_at_bats > 500
ORDER BY
	career_average DESC
LIMIT 10;
```

Returns

<img src="../images/L4_best_hitters.jpeg">
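
The interplay of HAVING, ORDER BY, and LIMIT can be checked on a few invented rows. This sketch again uses the built-in `sqlite3` module for convenience, so two details differ from the MySQL query above: the average is multiplied by `1000.0` first because SQLite would otherwise perform integer division, and HAVING repeats the aggregate expression instead of using the alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Batting (playerID TEXT, yearID INTEGER, AB INTEGER, H INTEGER);
    INSERT INTO Batting VALUES
        ('p1', 2000, 400, 120), ('p1', 2001, 600, 180),
        ('p2', 2000, 300, 120), ('p2', 2001, 300, 90),
        ('p3', 2000, 10, 9);
""")

top = conn.execute("""
    SELECT playerID,
           SUM(AB) AS total_at_bats,
           1000.0 * SUM(H) / SUM(AB) AS career_average
    FROM Batting
    GROUP BY playerID
    HAVING SUM(AB) > 500          -- filter groups, not individual rows
    ORDER BY career_average DESC  -- best average first
    LIMIT 2                       -- keep only the top two groups
""").fetchall()

print(top)
```

Player `p3` has the highest ratio of hits to at bats but is removed by HAVING because of the small number of at bats, which is the reason for the threshold in the query above.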


### Summary -- Implementation and Next HW

- This has been a lot to absorb.


- We have begun to see how powering a website with a relational database enables sophisticated question/answer and reports for multiple users.


- We will start fleshing this out in the next lecture.