# Discord Server Activity Pattern (Valorant)

# - Sahil Makhijani, Vinay Warrier

# Introduction

This project, Discord Server Activity Pattern, analyzes user activity on the official ‘VALORANT’ Discord server, a space in which both members of this project actively take part. Our team consists of Sahil Makhijani and Vinay Warrier, and we have both been part of this server since the beginning. As active users, we have always been curious about the patterns of user behavior and interaction on it, and this project seemed like a fun way to turn our personal experience into an in-depth analysis of the server. Our dataset runs from March 2020, when the server was set up, to April 2021, which gives us a window of roughly one year, long enough to identify significant changes and trends in server activity.


In our work, we applied a variety of techniques for processing and visualizing the dataset. We used multivariate visualizations that encode several variables at once through color, marker, 3D axes and interactivity, which makes relationships among multiple variables easier to see. We built different kinds of word clouds to analyze the most frequently used words in the chat and draw insights from them. We applied sentiment analysis to user messages to gauge the overall mood of the community and to detect minor shifts in sentiment over the year. Together, these techniques give a bigger picture of the server members: who is most active, and what kind of content drives engagement.

We collected the data for this project ourselves using an automated export tool, DiscordChatExporter (https://github.com/Tyrrrz/DiscordChatExporter).

# Load & View Dataset

In [1]:
%pip install -U -r requirements.txt

Note: you may need to restart the kernel to use updated packages.




In [2]:
from data import *
from viz import *

In [3]:
df = get_data()
df.head()

Unnamed: 0,AuthorID,Author,Date,Content,Attachments,Reactions
0,153581473545322496,midimint,2020-03-02T13:28:44.5150000+05:30,https://gfycat.com/glamorousrewardingdalmatian...,,"🅱️ (333),⭕ (252),🇳 (231),OwO1 (269),NO (189),🇮..."
1,136625373814325248,sungravity,2020-03-02T13:28:48.2400000+05:30,:eyesFlipped:,,"👀 (55),blobcookie (37),aletter_w (35),aletter_..."
2,153581473545322496,midimint,2020-03-02T13:29:09.6410000+05:30,Courtesy of @Trooper,,"🇫 (39),🅰️ (35),🇷 (36),🇹 (35),🇪 (34),🇩 (30)"
3,214489850701807616,trooperhx,2020-03-02T13:29:32.0520000+05:30,:blobeyesderpy:,,"🍕 (25),🇼 (31),🇦 (31),🇸 (28),🇭 (28),🇪 (28),🇷 (2..."
4,447303538662572043,ekhros,2020-03-02T13:29:52.0400000+05:30,:Jotaro_Party:,,"🇻 (29),🇮 (29),🇧 (29),🇪 (28),🇺 (24),🇸 (23),📧 (1..."


# Motivation

This project grew out of personal experience. Both of us have been active Discord users since 2018 and active Valorant players since the beta release in early 2020, so the knowledge we have gained of the platform and the game made this a natural choice for analysis. Valorant’s Discord server suits us best because we can offer deep insights into both the game itself and its community. Although our analysis is narrowed down to one game and one platform, its relevance is not limited to them: many of the insights we gather can be generalized to a far larger pool of games and social platforms. The analysis can help moderators understand user activity and how to keep users engaged. Game developers can find out which features really drive engagement and which are responsible for disengagement. Traditional review systems are filtered, whereas this analysis focuses on unfiltered opinions shared in casual chat, which gives a deeper understanding of user behaviour and can help companies build more engaging platforms and improve game experiences.


# Methods

The methods used in this project are as follows:

#### *Exploratory Data Analysis (EDA):*

EDA helped us understand the structure and quality of the dataset. Its main steps were data cleaning, handling missing values and computing summary statistics. This gave us a detailed overview of the trends and outliers in the dataset. We chose this approach because it forms the basis for all the other methods: it ensures that the insights are derived from a clean and accurate dataset.

In [4]:
eda(df)

1. How much data do I have?
(7184184, 6)

2. What does my data look like?


Unnamed: 0,AuthorID,Author,Date,Content,Attachments,Reactions
0,153581473545322496,midimint,2020-03-02T13:28:44.5150000+05:30,https://gfycat.com/glamorousrewardingdalmatian...,,"🅱️ (333),⭕ (252),🇳 (231),OwO1 (269),NO (189),🇮..."
1,136625373814325248,sungravity,2020-03-02T13:28:48.2400000+05:30,:eyesFlipped:,,"👀 (55),blobcookie (37),aletter_w (35),aletter_..."
2,153581473545322496,midimint,2020-03-02T13:29:09.6410000+05:30,Courtesy of @Trooper,,"🇫 (39),🅰️ (35),🇷 (36),🇹 (35),🇪 (34),🇩 (30)"
3,214489850701807616,trooperhx,2020-03-02T13:29:32.0520000+05:30,:blobeyesderpy:,,"🍕 (25),🇼 (31),🇦 (31),🇸 (28),🇭 (28),🇪 (28),🇷 (2..."
4,447303538662572043,ekhros,2020-03-02T13:29:52.0400000+05:30,:Jotaro_Party:,,"🇻 (29),🇮 (29),🇧 (29),🇪 (28),🇺 (24),🇸 (23),📧 (1..."


Unnamed: 0,AuthorID,Author,Date,Content,Attachments,Reactions
6275027,468472962891120642,aze0s,2021-03-23T14:11:59.9500000+05:30,Well the game won’t tell you if they banned so...,,
4694873,764413976662376470,m4m1hh,2021-01-02T00:59:03.9050000+05:30,I'm looking through the translate so I'm givin...,,
4310847,748439757800996874,rafsan_,2020-12-06T18:27:50.9110000+05:30,WAIT WHAT,,
5914680,456226577798135808,Deleted User,2021-03-08T01:52:38.0960000+05:30,"ill give you ,y bag of uncooked popcorn",,
1872021,429772173767344128,sunn9884,2020-07-13T03:18:45.2250000+05:30,Me when ❄️,,



3. What are the data types of each column?
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7184184 entries, 0 to 7184183
Data columns (total 6 columns):
 #   Column       Dtype 
---  ------       ----- 
 0   AuthorID     int64 
 1   Author       object
 2   Date         object
 3   Content      object
 4   Attachments  object
 5   Reactions    object
dtypes: int64(1), object(5)
memory usage: 328.9+ MB


None


4. Are there any missing values?


AuthorID             0
Author               0
Date                 0
Content          15614
Attachments    7168748
Reactions      7117783
dtype: int64


5. Are there duplicate values?


np.int64(0)
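The five checks printed above come from the notebook's own `eda` helper, whose source is not shown here. As a rough, dependency-free sketch of the same kind of checks (shape, missing values, duplicates), the snippet below works on a three-row stand-in built from rows in the previews above. The function name `summarize` and the tiny `SAMPLE_CSV` are ours, not part of the project; on a real DataFrame the equivalent pandas calls would be along the lines of `df.shape`, `df.isna().sum()` and `df.duplicated().sum()`.

```python
import csv
import io
from collections import Counter

# Tiny stand-in for the exported CSV: three rows copied from the previews above.
SAMPLE_CSV = """AuthorID,Author,Date,Content,Attachments,Reactions
153581473545322496,midimint,2020-03-02T13:29:09.6410000+05:30,Courtesy of @Trooper,,"🇫 (39),🅰️ (35),🇷 (36),🇹 (35),🇪 (34),🇩 (30)"
748439757800996874,rafsan_,2020-12-06T18:27:50.9110000+05:30,WAIT WHAT,,
429772173767344128,sunn9884,2020-07-13T03:18:45.2250000+05:30,Me when ❄️,,
"""


def summarize(rows):
    """Return shape, per-column missing counts and the number of duplicate rows."""
    columns = list(rows[0].keys()) if rows else []

    # An empty CSV field is treated as a missing value.
    missing = Counter()
    for row in rows:
        for col in columns:
            if row[col] == "":
                missing[col] += 1

    # A row is a duplicate if an identical row appeared earlier.
    seen = Counter(tuple(row[col] for col in columns) for row in rows)
    duplicates = sum(n - 1 for n in seen.values())

    return {
        "shape": (len(rows), len(columns)),
        "missing": {col: missing[col] for col in columns},
        "duplicates": duplicates,
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize(rows)
print(summary["shape"])
print(summary["missing"])
print(summary["duplicates"])
```

On the full dataset the same counts explain the cleaning choices that follow: the attachments column is almost entirely empty, and only a small share of messages has no text.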




In [5]:
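# Replace missing values with empty strings so text columns compare uniformly.
fill_na(df, "")

clean_column_names(df)  # e.g. AuthorID -> authorid
to_proper_types(df)     # e.g. parse the date strings into timestamps

drop_column(df, "attachments")   # missing in 7,168,748 of 7,184,184 rows
drop_rows(df, df.content == "")  # removes the 15,614 messages without text

add_sentiment(df, "content")          # by far the slowest step
add_date_derivatives(df, "date")      # hour, day, month, year, day_name, month_name
add_total_reactions(df, "reactions")  # reactions_total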

In [6]:
print(df.shape)
df.head()

(7168570, 13)


Unnamed: 0,authorid,author,date,content,reactions,sentiment,hour,day,month,year,day_name,month_name,reactions_total
0,153581473545322496,midimint,2020-03-02 13:28:44.515000+05:30,https://gfycat.com/glamorousrewardingdalmatian...,"🅱️ (333),⭕ (252),🇳 (231),OwO1 (269),NO (189),🇮...",0.0,13,0,3,2020,Monday,March,3956
1,136625373814325248,sungravity,2020-03-02 13:28:48.240000+05:30,:eyesFlipped:,"👀 (55),blobcookie (37),aletter_w (35),aletter_...",0.0,13,0,3,2020,Monday,March,559
2,153581473545322496,midimint,2020-03-02 13:29:09.641000+05:30,Courtesy of @Trooper,"🇫 (39),🅰️ (35),🇷 (36),🇹 (35),🇪 (34),🇩 (30)",0.3612,13,0,3,2020,Monday,March,209
3,214489850701807616,trooperhx,2020-03-02 13:29:32.052000+05:30,:blobeyesderpy:,"🍕 (25),🇼 (31),🇦 (31),🇸 (28),🇭 (28),🇪 (28),🇷 (2...",0.0,13,0,3,2020,Monday,March,325
4,447303538662572043,ekhros,2020-03-02 13:29:52.040000+05:30,:Jotaro_Party:,"🇻 (29),🇮 (29),🇧 (29),🇪 (28),🇺 (24),🇸 (23),📧 (1...",0.0,13,0,3,2020,Monday,March,443
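The derived columns in the preview above can be checked by hand. For the third row, the counts in the reactions string add up to the listed `reactions_total` of 209, and 2 March 2020 was a Monday, which matches `day` = 0 and `day_name` = Monday. The sketch below reproduces both derivations with the standard library only. It is our reconstruction from the visible output, not the source of `add_total_reactions` or `add_date_derivatives`, and the function names are hypothetical.

```python
import re
from datetime import datetime


def total_reactions(reactions):
    """Sum the counts in a string such as 'name (39),other (35)'."""
    return sum(int(n) for n in re.findall(r"\((\d+)\)", reactions))


def date_derivatives(raw):
    """Parse an exported timestamp and return the derived calendar fields."""
    # The export carries 7 fractional digits; %f accepts at most 6, so trim the rest.
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", raw)
    ts = datetime.strptime(trimmed, "%Y-%m-%dT%H:%M:%S.%f%z")
    return {
        "hour": ts.hour,
        "day": ts.weekday(),  # Monday == 0, as in the preview
        "month": ts.month,
        "year": ts.year,
        "day_name": ts.strftime("%A"),
        "month_name": ts.strftime("%B"),
    }


# Values copied from the third row of the raw preview.
row_reactions = "🇫 (39),🅰️ (35),🇷 (36),🇹 (35),🇪 (34),🇩 (30)"
row_date = "2020-03-02T13:29:09.6410000+05:30"

print(total_reactions(row_reactions))
print(date_derivatives(row_date))
```

Note that the timestamps carry a +05:30 offset, so hours and weekdays are expressed in that timezone rather than in each user's local time.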


#### *Interactive Visualizations*

Certain visualizations were made interactive to allow live analysis of the dataset. An interactive dashboard or plot lets the viewer filter and change the view to focus on, for example, particular periods of time or kinds of activity. This makes working with the data more flexible: users can explore the data themselves and look for the specific insight they are after. Our main reason for using it is to make exploration easy for moderators and game developers who want to dig into the data on their own.

#### *Multivariate Statistics*

As the name ‘multivariate’ suggests, this method was used to display the relationship between two or more variables. Scatter plots and heat maps show how variables vary together, which makes dependencies between trends in the data visible. Multivariate statistics give greater insight into user interactions, for example the relationship between reactions, user sentiment and date. They reveal interactions between factors that would be missed in a purely univariate analysis.
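To make "how two variables vary together" concrete, the sketch below computes a Pearson correlation coefficient between the `sentiment` and `reactions_total` values of the five preview rows shown earlier. This is only an illustration of the statistic: five rows say nothing about the full dataset, and the notebook's own plots may rely on different measures. The function name `pearson` is ours.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations (covariance times n) and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Values copied from the five-row preview of the cleaned dataset.
sentiment = [0.0, 0.0, 0.3612, 0.0, 0.0]
reactions_total = [3956, 559, 209, 325, 443]

r = pearson(sentiment, reactions_total)
print(round(r, 3))
```

A value near +1 or -1 would indicate a strong linear relationship, while a value near 0 would indicate none; a scatter plot of the same two columns shows the same information visually.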

#### *Time Series Analysis*

This analysis shows how user activity and message frequency varied over time. It focuses on identifying patterns in daily, weekly and monthly server activity, as well as anomalies in engagement. We chose this method to track trends and patterns that recur periodically within the community. With this approach we could find the most active times of the day, week or month, which is helpful for moderators and game developers alike.
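At its core, this kind of analysis counts messages per time bucket. The sketch below shows the idea with the standard library, using four timestamps taken from the previews above and shortened to whole seconds. The function name `activity_counts` is hypothetical and the notebook's own implementation may differ, for example by resampling a pandas time index.

```python
from collections import Counter
from datetime import datetime


def activity_counts(timestamps):
    """Count messages per weekday name, per hour of day and per calendar date."""
    by_weekday = Counter(ts.strftime("%A") for ts in timestamps)
    by_hour = Counter(ts.hour for ts in timestamps)
    by_date = Counter(ts.date() for ts in timestamps)
    return by_weekday, by_hour, by_date


# Timestamps from the previews above, fractional seconds dropped.
stamps = [
    datetime.fromisoformat("2020-03-02T13:28:44+05:30"),
    datetime.fromisoformat("2020-03-02T13:28:48+05:30"),
    datetime.fromisoformat("2020-03-02T13:29:09+05:30"),
    datetime.fromisoformat("2020-12-06T18:27:50+05:30"),
]

by_weekday, by_hour, by_date = activity_counts(stamps)
peak_date, peak_count = by_date.most_common(1)[0]
print(by_weekday)
print(by_hour)
print(peak_date, peak_count)
```

The busiest weekday, hour or date is then simply the most common key in the corresponding counter.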

#### *Word Cloud*

We built a word cloud of the general chat to analyze the most frequent terms. It gives a quick overview of the common topics of conversation and of the catchphrases used in the community. The most frequently used words appear in the largest fonts, and font size decreases with the frequency of the term, which gives a sense of what matters most to the community. We chose this method because it is a simple and visually appealing way to highlight key themes and trends without complex statistical analysis.
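The mapping from term frequency to font size can be sketched in a few lines. Everything in this example is an assumption made for illustration: the stopword list, the tokenizing regex, the linear scaling rule and the four sample messages are invented, and the word cloud library used in the notebook may tokenize and scale differently. The optional `keep` argument restricts counting to a curated vocabulary, which is one way to build a cloud from a fixed list of terms.

```python
import re
from collections import Counter

# Illustrative stopword list, not the one used in the project.
STOPWORDS = {"the", "a", "to", "and", "is", "of", "i", "you", "it", "in", "or"}


def term_frequencies(messages, stopwords=STOPWORDS, keep=None):
    """Count lowercase words, skipping stopwords; optionally keep only a vocabulary."""
    counts = Counter()
    for message in messages:
        for word in re.findall(r"[a-z']+", message.lower()):
            if word in stopwords:
                continue
            if keep is not None and word not in keep:
                continue
            counts[word] += 1
    return counts


def font_sizes(counts, min_size=10, max_size=60):
    """Scale linearly so that the most frequent term gets max_size."""
    if not counts:
        return {}
    top = max(counts.values())
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


# Invented messages, for illustration only.
messages = ["wanna play?", "lol good game", "play jett or reyna", "ye lets play"]

counts = term_frequencies(messages)
sizes = font_sizes(counts)
agents_only = term_frequencies(messages, keep={"jett", "reyna"})
print(counts.most_common(3))
print(sizes["play"])
print(agents_only)
```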

#### *Sentiment Analysis*

This was used to understand the overall mood of the users based on their messages. The method classifies messages as positive, negative or neutral, which gives an idea of the emotional state of the community. We track sentiment shifts over the one-year period to determine whether game updates, announcements or in-game events had a significant impact on community sentiment. We chose sentiment analysis because it offers insights into user engagement by revealing patterns that are not visible through other common exploratory methods.
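The `sentiment` column in the cleaned dataset holds a numeric score per message, so a rule is needed to turn scores into the three classes. The sketch below uses cutoffs of ±0.05, which is the convention commonly recommended for VADER compound scores. Whether the notebook uses VADER or these cutoffs is an assumption on our part, and `label_sentiment` is a hypothetical name.

```python
from collections import Counter


def label_sentiment(score, threshold=0.05):
    """Map a score in [-1, 1] to 'positive', 'negative' or 'neutral'."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# Sentiment scores copied from the five-row preview of the cleaned dataset.
scores = [0.0, 0.0, 0.3612, 0.0, 0.0]

labels = [label_sentiment(s) for s in scores]
share = Counter(labels)
print(labels)
print(share)
```

Messages made up only of emotes or links score 0.0 in the preview, so they fall into the neutral class; averaging scores or class shares per month is one way to follow shifts over the year.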

# Insights

#### *Dashboard*

In [7]:
vd = ValochatDashboard(df)
vd.layout


Converting to PeriodArray/Index representation will drop timezone information.





[Interactive dashboard output (Bokeh widget); not rendered in this export]

#### *Active Contributors:*

Our analysis ranks “Deleted User” as the most active user, but this is the default name Discord assigns when someone deletes their account. We therefore cannot accept it as the most active user: it stands for many different people, since everyone who deletes their account has their name converted to “Deleted User”. The most active contributor is “anthony813”, who sent over 60,000 messages between March 2020 and April 2021. This user is still active on the server in 2024 and has been boosting it since September 2023, which shows his commitment to the community. He has thus become one of the most valuable and consistent contributors to the official Valorant Discord server.
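The “Deleted User” problem arises when messages are grouped by display name, because many distinct accounts share that placeholder. A minimal sketch of a ranking that avoids it is shown below: it counts by author ID and skips the placeholder name. The sample pairs are taken from the previews above, while the function name `top_contributors` is ours and not part of the notebook.

```python
from collections import Counter

PLACEHOLDER = "Deleted User"


def top_contributors(messages, n=3):
    """Rank authors by message count.

    `messages` is an iterable of (author_id, author_name) pairs. Counting is
    done per ID, and accounts shown under the placeholder name are skipped.
    """
    names = {}
    counts = Counter()
    for author_id, name in messages:
        if name == PLACEHOLDER:
            continue
        counts[author_id] += 1
        names[author_id] = name
    return [(names[author_id], count) for author_id, count in counts.most_common(n)]


# (author_id, author) pairs copied from the previews above.
sample = [
    (153581473545322496, "midimint"),
    (136625373814325248, "sungravity"),
    (153581473545322496, "midimint"),
    (214489850701807616, "trooperhx"),
    (447303538662572043, "ekhros"),
    (456226577798135808, "Deleted User"),
]

ranking = top_contributors(sample, n=1)
print(ranking)
```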

#### *Weekly Chat Activity:*

The chat activity we visualized shows that most messages are sent on Tuesdays and Wednesdays, which form the peak; the rate then declines until it reaches its lowest point on Saturdays, after which it rises again. During this period the COVID-19 pandemic was between its start and its peak. Most people stayed at home, and many work and school schedules changed as a result. People therefore used Discord more during or after school or work, as it was a good channel for communication and entertainment. One possible reason for the midweek peak is midweek fatigue, which may have led people to interact more during this period. Another is that users chatted a lot on weekdays but moved to voice chat on weekends, so the message rate dropped because most communication happened on calls.

#### *Most Used Terms:*

Based solely on the word cloud analysis, the most used terms in the chat were ‘play’, ‘lol’, ‘good’, ‘server’ and ‘ye’. Most of these are common in any server because they are everyday words in conversation. What sets this server apart is ‘play’ and ‘ye’. ‘Play’ is typical of gaming servers, where people tell someone to press play and start the game once the squad is ready. ‘Ye’ is more curious: people in gaming servers often write ‘ye’ instead of ‘yes’, but users could also have been talking about the artist Kanye West changing his name to ‘Ye’. We could be wrong about this, but these are the explanations we could think of, since Kanye changed his name in late 2021 and had been considering it well before that.

#### *Most Used In-Game Terms:*

We compiled a list of over 60 in-game terms and built a word cloud from it. The most used of these were ‘valorant’, ‘reyna’, ‘jett’, ‘phantom’ and ‘vandal’. ‘Valorant’ is expected, since the server is dedicated to that game. ‘Reyna’ and ‘Jett’ come up often because they were two of the most popular agents at the time. Reyna had just been released, in early June 2020, and was the most overpowered agent in the game for a very long time. Jett had also been an overpowered agent since the launch of the game and only became comparable to other agents after being nerfed in 2023, so during the period covered by the data Jett was still one of the top 5 agents. ‘Vandal’ and ‘Phantom’ are the most commonly used guns in the game, with the highest buy and pick rates of any weapon.

#### *Peak Activity:*

Based on our time series analysis, chat activity peaked at the start of March, specifically on 2 March 2021. The reason was the release of the Episode 2 Act II update. One of its main features was a new agent, ‘Astra’, whose new ability creates a wall across the whole map, splitting it into two halves for a minute. This was very overpowered at the time. Alongside it, a new skin bundle, ‘Prime 2.0’, was launched. After the great success of the ‘Prime’ bundle, released in June 2020, the highly anticipated Prime 2.0, and especially its knife, a ‘butterfly knife’, was expected to be the best bundle of its time. The butterfly knife is inspired by a type of skin from the game CS:GO, where this particular knife is very costly, upwards of $1000 on the Steam Market, and is one of the best skins in that game.

# Conclusion

In the Discord Server Activity Pattern project, we studied user activity and engagement trends on the official Valorant Discord server from March 2020 to April 2021. Using word cloud analysis, we identified frequently used terms, including in-game references such as "Reyna," "Jett," "Phantom," and "Vandal," which reflect the user base’s focus on popular agents and weapons. Sentiment analysis showed user mood shifting with major game updates and announcements. Our time series analysis showed that activity peaked with the release of Episode 2 Act II and the Prime 2.0 skin bundle. The weekly cycle of chat activity pointed toward increased midweek activity, possibly because COVID-19 moved work and school schedules to the home. We also found "anthony813" to be the most active contributor, with consistent long-term engagement. These findings provide insights into community behavior that can help moderators adapt their engagement strategies and help developers understand which features appeal to players and which do not.