# Documentation and Cleaning Assessment: <i> Local Law 7-2018 Qualified Transactions </i>

## Table of Contents: 

[Documentation](#Documentation)

- [Attribution](#Attribution)
- [Semantic Contents](#Semantic-Contents)
- [Collection Process](#Collection-Process)
- [Data Structure](#Data-Structure)

[Cleaning Assessment (by column name)](#Cleaning-Assesment)

- [bbl](#bbl)
- [block](#block)
- [boro](#boro)
- [borough](#borough)
- [borough_cap_rate](#borough_cap_rate)
- [cap_rate](#cap_rate)
- [crfn](#crfn)
- [deed_date](#deed_date)
- [grantee](#grantee)
- [hnum_hi](#hnum_hi)
- [hnum_lo](#hnum_lo)
- [lot](#lot)
- [price](#price)
- [str_name](#str_name)
- [swl](#swl)
- [yearqtr](#yearqtr)

### Datasource Name: 

Local Law 7-2018 Qualified Transactions

## Documentation

### Attribution:
New York City Department of Housing Preservation and Development (HPD) and
New York City Department of Finance.

[Link to source (as of 3-31-19)](https://data.cityofnewyork.us/Housing-Development/Local-Law-7-2018-Qualified-Transactions/8wi4-bsy4)

### Semantic Contents 
#### Description:
This is a record of the sales of buildings that are at least 50% regulated by unit count. 
#### Selection:
The following columns were generated or kept:
- BBL: borough, block, and lot identifier of the property. Datatype: int
- price: sale price in dollars. Datatype: int
- cap_rate: capitalization rate (annual income / sale price). Datatype: float64
- borough_cap_rate: the median cap rate for the county. Datatype: float64
- Latitude: building latitude. Datatype: float64
- Longitude: building longitude. Datatype: float64
- BIN: building identification number (alternative foreign key). Datatype: string
- deed_date: date on the deed. Datatype: NumPy datetime object
- watchlist: watchlist status. Datatype: bool (1.0/0.0)
#### Date timeframe:
- Data last updated: March 29, 2019
- Metadata last updated: March 29, 2019
- Date created: November 2, 2018


### Collection Process

#### How the data was gathered:
[NYC Open Data](https://data.cityofnewyork.us/Housing-Development/Local-Law-7-2018-Qualified-Transactions/8wi4-bsy4)
### Data Structure
See Selection above

## Cleaning Assesment

---

[Table of Contents](#Table-of-Contents)
### bbl
> #### Column Description
Per Documentation:
" Borough, block, and lot "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: value = (boro).concat(block, 5).concat(lot, 4), i.e. the boro digit, then block zero-padded to 5 digits, then lot zero-padded to 4 digits
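
The constraint above can also be sketched outside OpenRefine. The following is a minimal Python illustration, assuming each row is available as a dictionary keyed by column name and that block and lot are zero-padded to 5 and 4 digits. The helper names `expected_bbl` and `bbl_violations` are hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
def expected_bbl(boro: int, block: int, lot: int) -> int:
    """Build the 10-digit BBL: 1-digit boro + 5-digit block + 4-digit lot."""
    return int(f"{boro}{block:05d}{lot:04d}")


def bbl_violations(rows):
    """Return the rows whose bbl does not match its boro/block/lot parts."""
    return [
        row
        for row in rows
        if int(row["bbl"])
        != expected_bbl(int(row["boro"]), int(row["block"]), int(row["lot"]))
    ]


# Illustrative rows (made-up values, not taken from the dataset).
sample = [
    {"bbl": 1000010001, "boro": 1, "block": 1, "lot": 1},
    {"bbl": 3012340056, "boro": 3, "block": 1234, "lot": 56},
    {"bbl": 2000010001, "boro": 1, "block": 1, "lot": 1},  # boro digit mismatch
]
bad_rows = bbl_violations(sample)  # rows to inspect by hand
```

Any row returned by `bbl_violations` corresponds to the "on fail: inspect" step.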
> #### Estimated Time Required
1 min to design and implement; on fail: inspect
> #### Programs Required
OpenRefine
> #### Hand edits

> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### block
> #### Column Description
Per Documentation:
" Tax block "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: None (covered by bbl)
> #### Estimated Time Required

> #### Programs Required

> #### Hand edits

> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### boro
> #### Column Description
Per Documentation:
" Borough where property is located (numeric) "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: None (covered by bbl)
> #### Estimated Time Required

> #### Programs Required

> #### Hand edits
> created BBL by concatenating its constituent elements (boro, block, lot)
> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### borough
> #### Column Description
Per Documentation:
" Borough where property is located (text) "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: None (covered by bbl)
> #### Estimated Time Required

> #### Programs Required

> #### Hand edits

> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### borough_cap_rate
> #### Column Description
Per Documentation:
" Borough capitalization rate "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: is float
> #### Estimated Time Required
to design and implement; on fail: inspect
> #### Programs Required
OpenRefine
> #### Hand edits
> - changed datatype to float64
> - created calculated column for presence on watch list by department rule (if building cap rate is less than mean boro cap rate)
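
The watch list rule described in the hand edits can be sketched as a small Python function. This is an illustration only: the name `watchlist_flag` is hypothetical, the strict less-than comparison follows the wording of the rule above, and the 1.0/0.0 encoding follows the `watchlist` datatype listed under Selection. The numbers in the usage lines are made up.

```python
def watchlist_flag(cap_rate: float, borough_cap_rate: float) -> float:
    """Return 1.0 when the building cap rate is below the borough cap rate, else 0.0."""
    return 1.0 if cap_rate < borough_cap_rate else 0.0


# Illustrative values (not taken from the dataset).
below = watchlist_flag(0.031, 0.045)  # building cap rate under the borough rate
equal = watchlist_flag(0.045, 0.045)  # equal rates are not flagged
above = watchlist_flag(0.060, 0.045)
```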
> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### cap_rate
> #### Column Description
Per Documentation:
" Capitalization rate "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: is float
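
As a rough equivalent of the "is float" check, the sketch below parses a raw cell value and returns `None` for anything that should be inspected by hand. The function name `as_float` is hypothetical, and treating NaN and infinite values as failures is an assumption, not something stated in the source documentation.

```python
import math


def as_float(value):
    """Return value as a finite float, or None if it cannot be used as one."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    # Assumption: NaN and infinity count as violations for a cap rate.
    return number if math.isfinite(number) else None


# Illustrative inputs (not taken from the dataset).
ok = as_float("0.045")
bad_text = as_float("n/a")
bad_nan = as_float("nan")
```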
> #### Estimated Time Required
to design and implement; on fail: inspect
> #### Programs Required
OpenRefine
> #### Hand edits
> changed datatype to float64
> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### crfn
> #### Column Description
Per Documentation:
" City Register file number "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: is int
> #### Estimated Time Required
to design and implement; on fail: inspect
> #### Programs Required
OpenRefine
> #### Hand edits

> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### deed_date
> #### Column Description
Per Documentation:
" Execution date of transfer "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: is date
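
The "is date" check can be sketched in Python as follows. The expected text format (`%m/%d/%Y`) is an assumption about how the export represents dates and should be adjusted to match the actual file. The function name `parse_deed_date` is hypothetical.

```python
from datetime import datetime


def parse_deed_date(value, fmt="%m/%d/%Y"):
    """Return a date parsed from value, or None if value is not a valid date."""
    try:
        return datetime.strptime(value.strip(), fmt).date()
    except (ValueError, AttributeError):
        # ValueError: text does not match fmt; AttributeError: value is not a string.
        return None


# Illustrative inputs (not taken from the dataset).
good = parse_deed_date("03/29/2019")
bad = parse_deed_date("13/45/2019")  # no such month or day
missing = parse_deed_date(None)
```

Values that come back as `None` are the ones to inspect.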
> #### Estimated Time Required
to design and implement; on fail: inspect
> #### Programs Required
OpenRefine
> #### Hand edits
> changed datatype to NumPy datetime object
> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### grantee
> #### Column Description
Per Documentation:
" Name of buyer(s) "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: has value
> #### Estimated Time Required
to design and implement; on fail: inspect 
> #### Programs Required
OpenRefine
> #### Hand edits

> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### hnum_hi
> #### Column Description
Per Documentation:
" Portion of street address: high house number "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: is num or -
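
One possible reading of "is num or -" is that a house number may contain only digits and hyphens, as in hyphenated Queens-style numbers such as 12-34. The sketch below encodes that reading as an assumption: it accepts digit groups separated by single hyphens and rejects a lone hyphen or an empty value. The pattern and the name `is_valid_house_number` are hypothetical, and the same check would apply to `hnum_lo`.

```python
import re

# Assumed reading of "is num or -": digit groups separated by single hyphens.
HOUSE_NUMBER = re.compile(r"\d+(?:-\d+)*")


def is_valid_house_number(value) -> bool:
    """Return True when value consists of digits, optionally hyphen-separated."""
    if value is None:
        return False
    return HOUSE_NUMBER.fullmatch(str(value).strip()) is not None


# Illustrative inputs (not taken from the dataset).
checks = {v: is_valid_house_number(v) for v in ["123", "12-34", "12A", "-", ""]}
```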
> #### Estimated Time Required
to design and implement; on fail: inspect 
> #### Programs Required
OpenRefine
> #### Hand edits

> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### hnum_lo
> #### Column Description
Per Documentation:
" Portion of street address: low house number "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: is num or -
> #### Estimated Time Required
to design and implement; on fail: inspect
> #### Programs Required
OpenRefine
> #### Hand edits

> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### lot
> #### Column Description
Per Documentation:
" Tax lot "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation: None (covered by bbl)
> #### Estimated Time Required

> #### Programs Required

> #### Hand edits

> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### price
> #### Column Description
Per Documentation:
" Price "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation:
> #### Estimated Time Required
to design and implement; on fail:
> #### Programs Required
OpenRefine
> #### Hand edits
> changed datatype to int64


> #### Other Notes
---

[Table of Contents](#Table-of-Contents)
### str_name
> #### Column Description
Per Documentation:
" Portion of street address: street name "
> #### Cleaning Actions Needed
Check Integrity Constraint Violation:
> #### Estimated Time Required
to design and implement; on fail:
> #### Programs Required
OpenRefine
> #### Hand edits

> #### Other Notes
---


---

