# Mattermost Lunch Channel History

- Data Source: [Mattermost API](https://api.mattermost.com/), [CCTB instance](https://cctb-intern.biologie.uni-wuerzburg.de/)
- Tasks:
	- Part I - June 2024: retrieving chat history data through the Mattermost API
	- Part II - September 2024: analyzing messages in the lunch channel
	- Part III - September 2024: specific tasks
- Language: [R](https://www.r-project.org/)

## Select one of the following tasks

> General comment: your estimate in step 1 does not need to be perfect; settle for a heuristic that is good enough.

### Task A - most crowded day of the week
- estimate the total number of people having lunch (or coffee) at the CCTB/mensa for each day → on which day did the most people go to lunch?
- plot the number of people per day over time (also try to summarize by week/month/year)
- plot a boxplot for the number of people per day of the week → what is the most crowded day of the week?
- make the same plot as above, but separately for every month/year → is there a shift in day of the week preference?
- perform a statistical test for the hypothesis: "Mondays and Fridays are less crowded than Tuesday to Thursday"
- discuss caveats of the data and methods used
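
The steps of Task A could be sketched roughly as follows. This is only one possible heuristic, not a reference solution: it assumes that the number of distinct users posting on a day approximates the number of people going to lunch, and it runs on a made-up `toy_messages` tibble that only mimics the `create_at` and `username` columns of `messages.csv`. The names `toy_messages`, `people_per_day`, `weekday_boxplot` and `test_result` are hypothetical.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Made-up stand-in for `messages` (illustration only, not real chat data)
toy_messages <- tibble(
  create_at = ymd_hms(c(
    "2024-06-03 11:40:00", "2024-06-03 11:42:00",                        # Monday
    "2024-06-04 11:30:00", "2024-06-04 11:31:00", "2024-06-04 11:35:00", # Tuesday
    "2024-06-05 11:45:00", "2024-06-05 11:50:00", "2024-06-05 11:52:00", # Wednesday
    "2024-06-07 12:00:00"                                                # Friday
  )),
  username = c("a", "b", "a", "b", "c", "a", "c", "d", "a")
)

# Heuristic (assumption): people per day ~ distinct users who posted that day
people_per_day <- toy_messages %>%
  mutate(day = as_date(create_at)) %>%
  group_by(day) %>%
  summarize(n_people = n_distinct(username), .groups = "drop") %>%
  mutate(
    wday_num = wday(day, week_start = 1),               # 1 = Monday ... 7 = Sunday
    weekday  = wday(day, label = TRUE, week_start = 1),
    group    = factor(if_else(wday_num %in% c(1, 5), "Mon/Fri", "Tue-Thu"),
                      levels = c("Mon/Fri", "Tue-Thu"))
  ) %>%
  filter(wday_num <= 5)                                 # drop weekends

# Boxplot of people per day, grouped by day of the week
weekday_boxplot <- ggplot(people_per_day, aes(x = weekday, y = n_people)) +
  geom_boxplot()

# One-sided Wilcoxon rank-sum test: are Mon/Fri counts lower than Tue-Thu counts?
test_result <- wilcox.test(n_people ~ group, data = people_per_day,
                           alternative = "less", exact = FALSE)
```

A rank-based test is used here because daily counts are small integers and unlikely to be normally distributed; a t-test or a count regression would be equally defensible choices. Days without any message do not appear in `people_per_day` at all, which is one of the caveats worth discussing.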

### Task B - lunch time
- estimate the time of lunch/coffee for each day → when was the most popular time?
    - try to consider proposed times ("mensa at 12?", "11:15?")
    - direct calls ("mensa?", "now")
    - relative times ("lunch in 5min", "mensa in half an hour?")
- plot the lunch time over the years (also try to summarize by week/month/year) → is there a trend (gradual shift or break point(s)) in lunch time?
- plot a boxplot for the lunch time per day of the week → is there a difference in lunch time per day of the week?
- make the same plot as above, but separately for every month/year → is the pattern above consistent over the year(s)?
- perform a statistical test for the hypothesis: "Lunch time is later during semester break (April, May, August, September) than during lecture period since 2022"
- discuss caveats of the data and methods used
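
The three kinds of time hints listed above could be handled along these lines. This is a minimal sketch under stated assumptions: the regular expressions, the 10 to 14 o'clock window for plausible lunch hours, and the fallback to the posting time for direct calls are all illustrative choices, and `toy_messages`, `clock_pattern` and `lunch_times` are hypothetical names operating on made-up rows. Phrases such as "in half an hour" are not covered.

```r
library(dplyr)
library(stringr)
library(lubridate)

# Made-up stand-in for `messages` (illustration only, not real chat data)
toy_messages <- tibble(
  create_at = ymd_hms(c("2024-06-03 11:40:00", "2024-06-04 11:05:00",
                        "2024-06-05 11:50:00", "2024-06-06 11:55:00")),
  message   = c("mensa at 12?", "11:15?", "lunch in 5min", "mensa?")
)

# Assumption: a proposed clock time is an hour between 10 and 14,
# optionally followed by ":MM"
clock_pattern <- "\\b(1[0-4])(?::([0-5]\\d))?\\b"

lunch_times <- toy_messages %>%
  mutate(
    # relative offset such as "in 5min" or "in 10 min"
    rel_min = as.numeric(str_match(message, "in\\s+(\\d+)\\s*min")[, 2]),
    # proposed clock time such as "11:15" or "at 12"
    clock_h = as.numeric(str_match(message, clock_pattern)[, 2]),
    clock_m = coalesce(as.numeric(str_match(message, clock_pattern)[, 3]), 0),
    # posting time in minutes since midnight
    posted  = hour(create_at) * 60 + minute(create_at),
    # priority: relative offset, then proposed clock time, then posting time
    lunch_min = case_when(
      !is.na(rel_min) ~ posted + rel_min,
      !is.na(clock_h) ~ clock_h * 60 + clock_m,
      TRUE            ~ posted
    )
  )
```

Checking the relative offset first keeps a message like "in 10 min" from being read as "10 o'clock". Storing the result as minutes since midnight makes it straightforward to summarize per day and to plot or test later.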

### Task C - your own idea
If you have other ideas, feel free to follow them, but create a plan similar to those for Tasks A and B above before you start.

In [1]:
library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.0     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ purrr::%||%()   masks base::%||%()
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors


In [2]:
theme_set(theme_light())

In [3]:
set.seed(42)
# shuffle the four names into a random order (reproducible via the seed above)
"Jana Sascha Mike Markus" %>% str_split(" ") %>% unlist %>% sample %>% str_c(collapse=" → ")

## 1. Data loading

Load files:
- `messages.csv`
- `reactions.csv`
- `files.csv`

In [4]:
messages <- read_csv("messages.csv")
reactions <- read_csv("reactions.csv")
files <- read_csv("files.csv")

Rows: 6949 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (3): post_id, message, username
dbl  (2): num_reactions, num_files
dttm (1): create_at

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Rows: 3630 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (4): post_id, emoji_name, username, emoji
dttm (1): create_at

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Rows: 221 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): post_id, file_id, link

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
