-
Notifications
You must be signed in to change notification settings - Fork 10
/
ProjectTemplate.Rmd
77 lines (51 loc) · 2.65 KB
/
ProjectTemplate.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
---
title: "Research project paper template"
date: 19.01.2018
author: Olga Lyashevskaya
output:
html_document:
toc: true
toc_float: true
code_folding: show
---
```{r setup, echo=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
### Materials
[Link to the data set](https://github.com/...) (csv file)
[Links to additional scripts, with comments](https://github.com/...) (optional)
[Links to other supplementary materials (questionnaires, etc.)](https://github.com/...) (optional)
## Introduction
Describe the phenomenon.
Provide information about the language (only for minor languages).
Mention some previous research (optional).
## Research hypothesis
...
Discuss possible relevant factors.
Formulate null hypothesis.
## Data
* Dependent variable, its type
* Predictor variables, their types, range/levels
* Number of observations
```{r dataset}
df <- read.csv("example.csv")
summary(df)
str(df)
```
### Data collection and annotation
Mention the source of your data, details of data collection. Justify the amount of data under study. In addition, you can discuss certain difficulties, peculiarities and shortcomings in data collection and annotation.
### Data considerations
Discuss research design (if applicable), independence, autocorrelation, nestedness of data, possible biases, etc.
## R libraries in use
```{r libraries}
library(tidyverse)
# include R libraries here or later
```
## Analysis: descriptive statistics
In this section you can put tables and plots that show data distribution for individual variables and distribution for pairs of variables. Histograms, density plots, boxplots (or violin plots), mosaic plots are recommended. You can also provide some basic statistical analysis, such as correlation, statistical significance, t-test and other simple tests. If you calculate a chi-square statistic, do not forget to estimate the effect size.
## Multi-factor analysis
At least two different methods have to be used. Please provide the output models summaries and graphs. Please evaluate your models (show and interpret relevant indicators such as classification accuracy, goodness of fit, classification power, inertia, variable significance, variable importance, etc.).
It is recommended to report, when needed, how did you fit the model, did you find any signs of overfitting, why do you believe that certain model is an optimal one, etc.
## Linguistic interpretation of the quantitative results
## Discussion on data distribution and quantitative methods in use
In conclusion, you can suggest ideas for further development of your research (correction of hypotheses, work with the data, their statistical analysis).