---
title: Team Project
subtitle: What You Should Do for the Team Project
author: Byeong-Hak Choe
institute: SUNY Geneseo
date: 2024-04-25
format:
  html:
    # toc: true
    # toc-depth: 2
    # toc-expand: true
    # toc-title: Contents
    code-fold: false
execute:
  echo: true
  eval: true
  message: false
  warning: false
fig-width: 9
fig-height: 6
---
```{r}
#| include: false
library(tidyverse)
library(skimr)
library(broom)
library(hrbrthemes)
# Use the hrbrthemes ipsum theme for all plots, with a gray facet strip
# and larger axis titles.
theme_set(theme_ipsum() +
            theme(strip.background = element_rect(fill = "lightgray"),
                  axis.title.x = element_text(size = rel(1.5)),
                  axis.title.y = element_text(size = rel(1.5))))
```
<br>
## Team Project
- Publish the webpage of your team's data analysis on **each team member's website**, hosted on GitHub.
- The project is due Thursday, May 16, 2024, at 11:59 P.M.
- If any team member has not participated in the project by May 8, 2024, please notify Byeong-Hak Choe by emailing [bchoe@geneseo.edu](mailto:bchoe@geneseo.edu) after that date.
<br>
## Project Data
- For the project, a team must choose one of the following data.frames:
  - `beer_markets`
  - `nyc_housing_sales`
### Beer Market Data
```{r}
beer_markets <- read_csv("https://bcdanl.github.io/data/beer_markets_all.csv")
```
```{r}
#| echo: false
#| results: asis
rmarkdown::paged_table(beer_markets)
```
#### Variable Description
- `hh`: an identifier of the household;
- `X_purchase_desc`: details on the purchased item;
- `quantity`: the number of items purchased;
- `brand`: Bud Light, Busch Light, Coors Light, Miller Lite, or Natural Light;
- `dollar_spent`: total dollar value of purchase;
- `beer_floz`: total volume of beer, in fluid ounces;
- `price_per_floz`: price per fluid ounce (i.e., `dollar_spent` / `beer_floz`);
- `container`: the type of container;
- `promo`: whether the item was promoted (coupon or otherwise);
- `market`: Scan-track market (or state if rural);
- `state`: U.S. state;
- demographic data, including gender, marital status, household income, class of work, race, education, age, the size of household, and whether or not the household has a microwave or a dishwasher.
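
The relationship between `price_per_floz`, `dollar_spent`, and `beer_floz` described above can be checked with a short sketch. The `toy_beer` data.frame and its values are made up for illustration and are not taken from `beer_markets`; the `price_check` column name is also hypothetical.

```{r}
library(dplyr)

# Hypothetical toy rows that mimic three columns of beer_markets;
# the values are invented for illustration only.
toy_beer <- tibble(
  brand        = c("Bud Light", "Miller Lite", "Coors Light"),
  dollar_spent = c(13.97, 20.17, 8.99),
  beer_floz    = c(288, 432, 144)
)

# Recompute the price per fluid ounce from its two components.
toy_beer <- toy_beer |>
  mutate(price_check = dollar_spent / beer_floz)

toy_beer
```

Applying the same `mutate()` step to the real `beer_markets` data.frame and comparing the result with `price_per_floz` would be one way to confirm how that variable was constructed.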
<br>
### NYC Housing Sales Data
```{r}
nyc_housing_sales <- read_csv("https://bcdanl.github.io/data/nyc_housing_sales_2006-2023.csv")
```
```{r}
#| echo: false
#| results: asis
rmarkdown::paged_table(nyc_housing_sales)
```
#### Variable Description
- For the description of variables in the `nyc_housing_sales` data.frame, please refer to the following webpage:
  - [https://www.nyc.gov/site/finance/property/glossary-property-sales.page](https://www.nyc.gov/site/finance/property/glossary-property-sales.page)
- For the building class codes, please refer to the following webpage:
  - [https://www.nyc.gov/assets/finance/jump/hlpbldgcode.html](https://www.nyc.gov/assets/finance/jump/hlpbldgcode.html)
<br>
## Key Components in the Project
- Below are the key components of the project.
  1. **Title**: A clear and concise title that gives an idea of the project topics.
  2. **Introduction**:
     - **Background**: Provide context for the research topics, explaining why they are significant or relevant.
     - **Statement of the Problem**: Clearly articulate the specific problem or issue the project will address.
  3. **Exploratory Data Analysis**:
     - List the questions your team aims to answer.
     - Address the questions through summary statistics, data visualization, and data transformation.
  4. **Significance of the Project**:
     - Explain its implications for real-world applications, business strategies, or public policy.
  5. **References**:
     - List all sources cited in the project.
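
As a rough illustration of the exploratory data analysis component, the sketch below shows one summary-statistics step, one data-transformation step, and one visualization step. It assumes a made-up `toy_sales` data.frame and a hypothetical question ("does the price per fluid ounce differ by brand?"); a real project would use `beer_markets` or `nyc_housing_sales` and the team's own questions.

```{r}
library(dplyr)
library(ggplot2)

# Hypothetical toy data; the brands and prices are invented.
toy_sales <- tibble(
  brand          = c("A", "A", "B", "B", "B"),
  promo          = c(TRUE, FALSE, FALSE, TRUE, FALSE),
  price_per_floz = c(0.040, 0.050, 0.060, 0.045, 0.055)
)

# Summary statistics: number of purchases and mean price for each brand.
price_by_brand <- toy_sales |>
  group_by(brand) |>
  summarise(n_purchases = n(),
            mean_price  = mean(price_per_floz))

# Data transformation: keep only the purchases that were not promoted.
regular_sales <- toy_sales |>
  filter(!promo)

# Data visualization: distribution of price per fluid ounce by brand.
price_plot <- ggplot(toy_sales, aes(x = brand, y = price_per_floz)) +
  geom_boxplot() +
  labs(x = "Brand", y = "Price per fl. oz. (dollars)")

price_by_brand
price_plot
```

Each step is only one of many possible choices; for example, a histogram or a faceted scatterplot may suit other questions better than a boxplot.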
<br>
## Rubric for the Project
- The rubric for the team project is available as a PDF file at the link below:
  - [Link](https://drive.google.com/file/d/1O2pK4vK2bd1ZP8nnlPOwNr1oAeAbuHHQ/view?usp=share_link)