In [1]:
options(jupyter.rich_display=FALSE)

# Problem 1: Multiple-choice test grading with adjustment

(Scoring: Each part 6 points. Total 30 points.)

In this problem, you are going to read test responses from a file and generate a data frame with the scores for each student.

The sample input data file *assg2-1.data* has the following form:
```
key,A,B,C,D,A
ahmet,A,NA,NA,D,A
canan,A,D,NA,D,A
kemal,D,C,B,A,A
meral,A,B,C,D,D
ziya,C,C,C,D,A
mine,NA,A,C,D,NA
```

Every row after the first corresponds to one student. The first column gives the student's name, and the remaining columns hold the student's response to each question. An `NA` indicates that the student did not answer the question. For example, `ahmet` replied A to the first question, left questions 2 and 3 unanswered, replied D to the fourth question and A to the last question.

The first row gives the key, that is, the correct answer to each question. All the other rows should be compared against it.

You should calculate the score of each student such that:
* three false responses cancel one correct response,
* the lowest possible score is zero (no negatives).

Your program will be tested with another file that has the same structure but different names and responses. The number of rows and number of columns may differ. An empty response will always be `NA`. Write your code to be as general as possible within the problem description.

**(A)** Write a function named **read_data** as given below:

```r
read_data <- function(path) {
    return(read.csv(path, header=F, sep=",", row.names=1, stringsAsFactors=F))
}
```

Download the data file given with this assignment. You can then read it into a data frame with the following command:

```
> testscores <- read_data("responses.csv")
> testscores
        V2   V3   V4 V5   V6
key      A    B    C  D    A
ahmet    A <NA> <NA>  D    A
canan    A    D <NA>  D    A
kemal    D    C    B  A    A
meral    A    B    C  D    D
ziya     C    C    C  D    A
mine  <NA>    A    C  D <NA>
```

If you get an error message, the file is probably not in your working directory. Either move it there, or give the full path together with the file name.

You don't have to use the name `testscores`.
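
If the data file is not at hand, the sample shown earlier can be recreated in a temporary file. The following is a minimal sketch under that assumption; the names `sample_lines` and `path` are introduced here only for illustration.

```r
# read_data as given in part (A), with FALSE spelled out
read_data <- function(path) {
    read.csv(path, header = FALSE, sep = ",", row.names = 1, stringsAsFactors = FALSE)
}

# Sample responses copied from the problem statement
sample_lines <- c(
    "key,A,B,C,D,A",
    "ahmet,A,NA,NA,D,A",
    "canan,A,D,NA,D,A",
    "kemal,D,C,B,A,A",
    "meral,A,B,C,D,D",
    "ziya,C,C,C,D,A",
    "mine,NA,A,C,D,NA"
)

# Write the lines to a temporary file and read them back
path <- tempfile(fileext = ".csv")
writeLines(sample_lines, path)
testscores <- read_data(path)
testscores
```

Because `read.csv` treats the string `NA` as a missing value by default, the unanswered questions arrive as `NA` without any extra options.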

**(B)** Write a function named **ncorrect** that takes one row of the data frame (a student's responses) together with the key row, and returns the number of correct responses.

Example:

```
> ncorrect(testscores["ahmet",], testscores[1,])
[1] 3
```

A useful trick: If you want to exclude `NA` values from a sum, you can set the `na.rm` option to `TRUE`.
```
> sum(c("abc",NA,"xyz") == c("abc","def","xyz"), na.rm=T)
[1] 2
```
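
To see why `na.rm` matters, it can help to look at the comparison before it is summed. This is a small sketch using the same vectors as above; the variable names `responses`, `key` and `matches` are chosen here for illustration.

```r
responses <- c("abc", NA, "xyz")
key <- c("abc", "def", "xyz")

# Comparing with NA yields NA, not FALSE
matches <- responses == key
matches                      # TRUE NA TRUE

# A single NA makes the plain sum NA
sum(matches)                 # NA

# Dropping the NA values first gives the count of TRUE entries
sum(matches, na.rm = TRUE)   # 2
```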

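**(C)** Write a function named **nfalse** that takes one row of the data frame (a student's responses) together with the key row, and returns the number of **false** responses. Note that `NA` values should be excluded: an unanswered question is neither correct nor false.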

Example:

```
> nfalse(testscores["ahmet",], testscores[1,])
[1] 0
```

**(D)** Write a function named **checkscores** that takes the original data frame (e.g. `testscores` as defined above) and returns a new data frame with the number of correct and false responses for each student.


Usage example:
```
> checkscores(testscores)
        V2   V3   V4 V5   V6 ncorrect nfalse
key      A    B    C  D    A        5      0
ahmet    A <NA> <NA>  D    A        3      0
canan    A    D <NA>  D    A        3      1
kemal    D    C    B  A    A        1      4
meral    A    B    C  D    D        4      1
ziya     C    C    C  D    A        3      2
mine  <NA>    A    C  D <NA>        2      1
```

Hint: An easy (but not the only) way is to use the built-in **apply()** function with the **ncorrect()** and **nfalse()** functions you created above.
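
As a reminder of how **apply()** passes extra arguments, here is a small sketch on unrelated toy data. The data frame `m` and the helper `count_above` are hypothetical and exist only for this illustration.

```r
# Toy data: three rows, two numeric columns
m <- data.frame(a = c(1, 5, 3), b = c(4, 2, 6), row.names = c("x", "y", "z"))

# A function of one row plus one extra argument
count_above <- function(row, threshold) {
    sum(row > threshold)
}

# MARGIN = 1 applies the function to each row;
# anything after the function name is passed on as extra arguments
res <- apply(m, 1, count_above, 2)
res
# x y z
# 1 1 2
```

Note that **apply()** converts a data frame to a matrix first, so if the columns have mixed types, every value in a row is seen as a character string.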

**(E)** Write a function named **adjustedscores** that takes the original data frame (e.g. `testscores` as defined above) and returns a new data frame with three new columns: the number of correct responses, the number of false responses, and the adjusted score for each student.

The adjusted score is calculated such that each false response reduces the number of correct responses by one-third. If the score turns out to be negative (e.g. one correct and four false responses), it should be set to zero.

Example:
```
> adjustedscores(testscores)
        V2   V3   V4 V5   V6 ncorrect nfalse adjusted
key      A    B    C  D    A        5      0 5.000000
ahmet    A <NA> <NA>  D    A        3      0 3.000000
canan    A    D <NA>  D    A        3      1 2.666667
kemal    D    C    B  A    A        1      4 0.000000
meral    A    B    C  D    D        4      1 3.666667
ziya     C    C    C  D    A        3      2 2.333333
mine  <NA>    A    C  D <NA>        2      1 1.666667
```
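
The `adjusted` column in this example can be checked by hand from the two count columns. The sketch below copies the counts from the table above and uses `pmax()` to clamp at zero, which is one possible way to do it; the vector names are illustrative.

```r
# Counts copied from the example table
n_correct <- c(key = 5, ahmet = 3, canan = 3, kemal = 1, meral = 4, ziya = 3, mine = 2)
n_false   <- c(key = 0, ahmet = 0, canan = 1, kemal = 4, meral = 1, ziya = 2, mine = 1)

# Each false response costs one third of a point;
# pmax() replaces any negative result with zero, element by element
adjusted <- pmax(n_correct - n_false / 3, 0)
round(adjusted, 6)
```

For `kemal`, the raw value is 1 - 4/3, which is negative, so the adjusted score becomes 0.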

# Solution

In [2]:
read_data <- function(path) {
    # No header row; the first column (student names) becomes the row names
    return(read.csv(path, header=FALSE, sep=",", row.names=1, stringsAsFactors=FALSE))
}

scores <- read_data("~/tests/assignments/2019_2020_2/responses_test.csv")

scores

      V2 V3 V4 V5 V6 V7
key   B  A  A  A  D  D 
ahmet B  C  B  B  C  NA
kemal NA A  C  A  NA D 
mert  NA D  A  A  D  A 
kadir D  A  D  C  C  C 
kaan  B  A  A  A  D  D 

part A

In [3]:
ncorrect <- function(student, key) {
    # Unanswered questions compare as NA and are dropped by na.rm
    return(sum(student == key, na.rm = TRUE))
}

ncorrect(scores["kaan",], scores[1,])

[1] 6

part B

In [4]:
nfalse <- function(student, key) {
    # Unanswered questions compare as NA and are dropped by na.rm
    return(sum(student != key, na.rm = TRUE))
}

nfalse(scores["mert",], scores[1,])

[1] 2

part C

In [5]:
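checkscores <- function(answers) {
    # The first row holds the answer key
    key <- unlist(answers[1, ])
    # Count both totals from the original response columns before adding any new column
    correct <- apply(answers, 1, ncorrect, key)
    wrong <- apply(answers, 1, nfalse, key)
    answers$ncorrect <- correct
    answers$nfalse <- wrong
    return(answers)
}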

checkscores(scores)

      V2 V3 V4 V5 V6 V7 ncorrect nfalse
key   B  A  A  A  D  D  6        0     
ahmet B  C  B  B  C  NA 1        4     
kemal NA A  C  A  NA D  3        1     
mert  NA D  A  A  D  A  3        2     
kadir D  A  D  C  C  C  1        5     
kaan  B  A  A  A  D  D  6        0     
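
A quick consistency check on these counts: for every row, correct + false + unanswered should equal the number of questions. The sketch below is a vectorized cross-check, not the solution's method. It rebuilds the responses from the output shown above (the test file itself is not reproduced here), and the names `m`, `is_correct`, `correct`, `wrong` and `unanswered` are illustrative.

```r
# Responses copied from the printed data frame
scores <- data.frame(
    V2 = c("B", "B", NA, NA, "D", "B"),
    V3 = c("A", "C", "A", "D", "A", "A"),
    V4 = c("A", "B", "C", "A", "D", "A"),
    V5 = c("A", "B", "A", "A", "C", "A"),
    V6 = c("D", "C", NA, "D", "C", "D"),
    V7 = c("D", NA, "D", "A", "C", "D"),
    row.names = c("key", "ahmet", "kemal", "mert", "kadir", "kaan"),
    stringsAsFactors = FALSE
)

m <- as.matrix(scores)
key <- unname(unlist(scores[1, ]))

# Transpose so that each column is one student, compare against the key
# (recycled down each column), then transpose back
is_correct <- t(t(m) == key)

correct    <- rowSums(is_correct, na.rm = TRUE)
wrong      <- rowSums(!is_correct, na.rm = TRUE)
unanswered <- rowSums(is.na(m))

# Every question is either correct, false, or unanswered
stopifnot(all(correct + wrong + unanswered == ncol(m)))
cbind(correct, wrong, unanswered)
```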

part D

In [6]:
adjustedscores <- function(answers) {
    calculated <- checkscores(answers)
    # Each false response costs one third of a correct response
    calculated$adjusted <- calculated$ncorrect - calculated$nfalse/3
    # Negative scores are set to zero
    calculated$adjusted[calculated$adjusted<0] <- 0
    return(calculated)
}

adjustedscores(scores)

      V2 V3 V4 V5 V6 V7 ncorrect nfalse adjusted
key   B  A  A  A  D  D  6        0      6.000000
ahmet B  C  B  B  C  NA 1        4      0.000000
kemal NA A  C  A  NA D  3        1      2.666667
mert  NA D  A  A  D  A  3        2      2.333333
kadir D  A  D  C  C  C  1        5      0.000000
kaan  B  A  A  A  D  D  6        0      6.000000

part E