---
title: "Opening a Dataset"
titleshort: "Opening a Dataset"
description: |
  Opening a Dataset.
core:
  - package: r
    code: |
      setwd()
  - package: readr
    code: |
      write_csv()
date: 2020-05-02
date_start: 2018-12-01
output:
  pdf_document:
    pandoc_args: '../_output_kniti_pdf.yaml'
    includes:
      in_header: '../preamble.tex'
  html_document:
    pandoc_args: '../_output_kniti_html.yaml'
    includes:
      in_header: '../hdga.html'
always_allow_html: true
urlcolor: blue
---
## Opening a Dataset
```{r global_options, include = FALSE}
try(source("../.Rprofile"))
```
`r text_shared_preamble_one`
`r text_shared_preamble_two`
`r text_shared_preamble_thr`
We have a dataset on basketball teams. The dataset, *Basketball.csv*, can be downloaded [here](https://github.com/FanWangEcon/Stat4Econ/tree/master/data/Basketball.csv).
We will load in the dataset and do some analysis with it.
### Paths to Data
**Relative Path**
The dataset is stored in a csv file. The folder structure for the file we are working in and for the data file is:

- main folder: Stat4Econ
    + subfolder: data
        * file: Basketball.csv
    + subfolder: descriptive
        * file: DataBasketball.ipynb (the jupyter notebook file)
        * file: DataBasketball.html (the html version of the jupyter notebook file)

Overall this means:

- the csv file's location is: **'/Stat4Econ/data/Basketball.csv'**
- the working R code file's location is: **'/Stat4Econ/descriptive/DataBasketball.ipynb'**

Given this structure, to access the *Basketball.csv* dataset from the descriptive subfolder, we go one folder up to the main folder, then into the data subfolder, and pick the Basketball.csv file there: **'../data/Basketball.csv'**. If the working directory is instead the main folder Stat4Econ, the relative path is simply **'data/Basketball.csv'**.
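
As a minimal sketch of how a relative path depends on the working directory, the chunk below rebuilds the folder layout above inside a temporary directory and reads the same file from two different starting points. The temporary location and the two-row placeholder file are assumptions made only for illustration; they are not the real dataset.

```{r}
# Recreate the folder layout in a temporary location (hypothetical stand-in for Stat4Econ)
main_folder <- file.path(tempdir(), "Stat4Econ")
dir.create(file.path(main_folder, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(main_folder, "descriptive"), recursive = TRUE, showWarnings = FALSE)

# Placeholder csv, NOT the real Basketball.csv contents
toy <- data.frame(team = c("A", "B"), wins = c(10, 12))
write.csv(toy, file.path(main_folder, "data", "Basketball.csv"), row.names = FALSE)

old_wd <- getwd()

# Working directory = descriptive subfolder: go one folder up, then into data
setwd(file.path(main_folder, "descriptive"))
from_subfolder <- read.csv("../data/Basketball.csv")

# Working directory = main folder: no need to go up
setwd(main_folder)
from_main <- read.csv("data/Basketball.csv")

# Restore the original working directory
setwd(old_wd)

# Both relative paths point at the same file
identical(from_subfolder, from_main)
```

The same file is reached through different relative paths, which is why it helps to check `getwd()` before loading data.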
**Absolute Path**
If these files are not in the same main folder but are in different locations on your computer, you can find the full path to the csv file and copy and paste it between the single quotes in the code below.

Search on Google to find out how to get the full path to a file:

- google search for [find full path for file on mac](https://www.google.com/search?q=find+full+path+for+file+on+mac)
    + this might end up looking like: **'/Users/fan/Downloads/Basketball.csv'**
- google search for [find full path for file on PC](https://www.google.com/search?q=find+full+path+for+file+on+pc)
    + this might end up looking like: **'C:/Users/fan/Documents/Dropbox/Basketball.csv'**
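
R can also report a file's full path itself. The sketch below assumes a placeholder csv written to a temporary folder (a hypothetical stand-in for a downloaded *Basketball.csv*) and uses the base R functions `normalizePath()` and `file.exists()` to obtain and check the absolute path before reading the file.

```{r}
# Hypothetical file in a temporary folder, standing in for a downloaded Basketball.csv
csv_path <- file.path(tempdir(), "Basketball.csv")
write.csv(data.frame(team = c("A", "B"), wins = c(10, 12)), csv_path, row.names = FALSE)

# normalizePath() returns the absolute path;
# winslash = "/" gives forward slashes on Windows as well
abs_path <- normalizePath(csv_path, winslash = "/", mustWork = TRUE)
print(abs_path)

# An absolute path works regardless of the current working directory
if (file.exists(abs_path)) {
  toy_data <- read.csv(abs_path)
}
```

In an interactive session, `file.choose()` opens a file picker and returns the chosen file's full path, which can be passed to `read.csv()` in the same way.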
**Using Relative path to load in data**
We will load in the data using the base R `read.csv` function.
- For what the variables mean, see [here](https://en.wikipedia.org/wiki/Basketball_statistics)
- For what NBA team names correspond to, see [here](https://en.wikipedia.org/w/index.php?title=Wikipedia:WikiProject_National_Basketball_Association/National_Basketball_Association_team_abbreviations).
```{r}
# Load the dataset in one line. The path is relative to the working directory,
# which can be checked with getwd() and changed with setwd().
basketball_data <- read.csv('data/Basketball.csv')
```
```{r}
# Summarize all variables in data frame
summary(basketball_data)
```
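
Besides `summary()`, a few other base R functions give a quick first look at a data frame. The sketch below uses a small made-up data frame (the column names and values are hypothetical, not taken from *Basketball.csv*) so that it runs by itself; the same calls can be applied to the data frame loaded above.

```{r}
# Made-up stand-in data (hypothetical columns and values, not from Basketball.csv)
toy_data <- data.frame(
  team = c("A", "B", "C"),
  wins = c(10, 12, 7),
  losses = c(5, 3, 8)
)

dim(toy_data)      # number of rows and columns
names(toy_data)    # variable names
str(toy_data)      # type and first values of each variable
head(toy_data, 2)  # first two rows

# summary() also works on a single numeric variable
wins_summary <- summary(toy_data$wins)
print(wins_summary)
```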