# Getting Data from Mastodon

> Mastodon is an ActivityPub-based, Twitter-like federated social network node. It provides an API that allows you to interact with every aspect of the platform.

Unlike most commonly used social media platforms, Mastodon is decentralized. Anyone can host their own Mastodon server and connect it to other servers.

This Python notebook serves as a brief introduction to using the Mastodon API to request posts.

Let's embark on our journey!

<img src="https://i.imgur.com/WuW3Qli.jpeg" alt="Embark on DataScience" width="200"/>


## API-Wrapper: Mastodon.py

Apart from basic HTTP requests to the API, there is a simple Python wrapper available that implements the complete Mastodon API. The [documentation](https://mastodonpy.readthedocs.io/en/stable/index.html) provides good examples on how to use the library. Additionally, I recommend checking out [Martin's blog post](https://martinheinz.dev/blog/86) on using Mastodon.py, which provides a great hands-on approach.

In [25]:
# installing and importing the API wrapper

# !pip install Mastodon.py
import mastodon
from mastodon import Mastodon

## Selecting a Mastodon Instance and Getting the API Token

Since Mastodon is organized in a decentralized manner, you need to choose a node to serve as your home instance. The selection of your home node has a significant impact on what will be shown on your timeline. You can find an overview of all Mastodon servers via this [link](https://joinmastodon.org/de/servers). For this notebook, we will be using [troet.cafe](https://troet.cafe) as it is the largest German server with approximately 8.5k active users. Please note that some Mastodon servers may prohibit or provide limited API access.

Instead of using your username and password for authentication, it is recommended to use an application token. This token can be easily revoked and allows you to define the permissions of the application more precisely.

Once you have an account on the Mastodon instance, visit https://*your-instance-url*/settings/applications. There, you can create a new application and obtain a new token.

<img src="https://i.imgur.com/KYh5JMB.png" width=500 />

The next step is to obtain the access token from the newly created application. Please ensure that you do not share your tokens, as anyone with access can perform actions on your behalf.

<img src="https://i.imgur.com/B4n6RaL.png" width=500 />

To ensure that the access token is not shared with other people, I recommend storing it in a file named "access_token.txt" in the main directory of this repository instead of pasting it into the notebook. If you keep the token somewhere else, adjust the file-opening part of the next cell accordingly.

In [26]:
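# setting up the Mastodon API
instance_url = "https://troet.cafe"

# read the token from the file and close the file again right away
with open("../../access_token.txt", "r") as token_file:
    access_token = token_file.read().strip()  # drop a trailing newline, if any

m = Mastodon(access_token=access_token, api_base_url=instance_url)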
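
If you would rather not depend on the fixed relative path used above, the token could also come from an environment variable. The following is only a minimal sketch: the helper `load_access_token` and the variable name `MASTODON_ACCESS_TOKEN` are assumptions made for illustration and are not part of Mastodon.py.

```python
import os
from pathlib import Path


def load_access_token(token_path="../../access_token.txt",
                      env_var="MASTODON_ACCESS_TOKEN"):
    """Return the token from the environment if set, otherwise from a file.

    The function name and the environment variable name are hypothetical.
    """
    token = os.environ.get(env_var)
    if token:
        return token.strip()
    # fall back to the token file; strip the trailing newline editors often add
    return Path(token_path).read_text(encoding="utf-8").strip()
```

The result can then be passed to the constructor as before, for example `Mastodon(access_token=load_access_token(), api_base_url=instance_url)`.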

## Getting Data

For the sake of social media analysis, **we will focus only on getting data from the API**. Please note that the wrapper can also be used to post data. If you are interested in that topic, I recommend the documentation.

In [49]:
# get the local timeline
timeline = m.timeline_local()
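
A single call to `timeline_local()` returns only one page of posts. For a larger sample, several pages can be collected. The following is a minimal sketch that relies on Mastodon.py's `limit` parameter and its `fetch_next()` pagination helper; the function name `collect_posts` and its parameters `max_pages` and `page_size` are hypothetical.

```python
def collect_posts(client, max_pages=3, page_size=40):
    """Collect posts from up to `max_pages` pages of the local timeline.

    `client` is expected to be a Mastodon.py client such as `m` above.
    """
    posts = []
    page = client.timeline_local(limit=page_size)
    pages_fetched = 1
    while page:  # stops on an empty page or on None
        posts.extend(page)
        if pages_fetched >= max_pages:
            break
        # fetch_next() returns None when there is no further page
        page = client.fetch_next(page)
        pages_fetched += 1
    return posts
```

Calling `collect_posts(m, max_pages=5)` would then return one flat list of posts. Keeping `max_pages` small helps to stay within the server's rate limits.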

In [50]:
# display the first post

timeline[0]

{'id': 112126817862289523,
 'created_at': datetime.datetime(2024, 3, 20, 7, 24, 22, 15000, tzinfo=tzutc()),
 'in_reply_to_id': None,
 'in_reply_to_account_id': None,
 'sensitive': False,
 'spoiler_text': '',
 'visibility': 'public',
 'language': 'de',
 'uri': 'https://troet.cafe/users/Kunstadresse/statuses/112126817862289523',
 'url': 'https://troet.cafe/@Kunstadresse/112126817862289523',
 'replies_count': 0,
 'reblogs_count': 0,
 'favourites_count': 0,
 'edited_at': None,
 'favourited': False,
 'reblogged': False,
 'muted': False,
 'bookmarked': False,
 'content': '<p>Liebe Regina, danke für deine motivierende Rückmeldung :-)<br />\xa0<br />Hallo Marianne, zunächst einmal, vielen Dank! Ich fühle mich bei der &quot;Kunstadresse&quot; sehr gut betreut. Ich finde es toll, wie du immer neue Wege findest, unsere Bücher vorzustellen.</p><p>Beste Grüße, Regina E.G. Schymiczek<br />\xa0</p><p>Jetzt Mitglied bei Kunstadresse werden!<br />Kunstadresse.de/code24</p><p>Kunstadresse </p><p><a href

As you can see in the output above, a post entry includes various attributes that you can use for your analysis.
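
Note that `content` is delivered as HTML, as the `<p>` and `<br />` tags in the output above show. For text analysis you will usually want plain text instead. The following is a minimal sketch using only Python's standard library; the class `PostTextExtractor` and the function `html_to_text` are hypothetical helpers and not part of Mastodon.py.

```python
from html.parser import HTMLParser


class PostTextExtractor(HTMLParser):
    """Collect the text of a post and turn <br> and </p> into line breaks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes entities like &quot;
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the plain text of an HTML snippet (hypothetical helper)."""
    parser = PostTextExtractor()
    parser.feed(html)
    parser.close()
    # replace non-breaking spaces and trim surrounding whitespace
    return "".join(parser.parts).replace("\xa0", " ").strip()
```

With this in place, `html_to_text(timeline[0]["content"])` would give the text of the first post without markup.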


In the following, we are going to focus on the application used to create a post.

In [51]:
# get the application used for all posts from the timeline

for post in timeline:
    # the key can be missing or set to None, so check the value rather than the key
    application = post.get("application")
    if application:
        print(application["name"])
        # TODO: put the names into a list and save them to a file
    else:
        print("No application specified")


Blog2Social
Web
Mastodon for Android
Fedilab
Fedilab
Mastodon for iOS
Tusky
Web
Mastodon for Android
Web
Tusky
Tusky
Web
Tusky
Web
Mastodon for Android
Ivory for iOS
Mastodon for iOS
Tusky
Mastodon for Android
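
Instead of only printing the names, they can be tallied to see which clients are used most. The following is a minimal sketch using `collections.Counter`; `count_applications` is a hypothetical helper, and the sample posts are made up to keep the example self-contained.

```python
from collections import Counter


def count_applications(posts):
    """Count how often each application name occurs in a list of posts."""
    counts = Counter()
    for post in posts:
        application = post.get("application")
        name = application["name"] if application else "No application specified"
        counts[name] += 1
    return counts


# made-up sample posts; in the notebook you would pass `timeline` instead
sample_posts = [
    {"application": {"name": "Web"}},
    {"application": {"name": "Tusky"}},
    {"application": {"name": "Web"}},
    {"application": None},
    {},
]

# print the application names, most frequent first
for name, count in count_applications(sample_posts).most_common():
    print(name, count)
```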


## Writing the Collected Data to a File

In [18]:
# extract the id, username, timestamp and content of the timeline and write it to a csv in the data folder

import csv
import os

os.makedirs('data', exist_ok=True)  # make sure the target folder exists

# newline='' lets the csv module handle line endings; utf-8 keeps umlauts and emoji intact
with open('data/timeline.csv', mode='w', newline='', encoding='utf-8') as timeline_file:
    timeline_writer = csv.writer(timeline_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    timeline_writer.writerow(['id', 'username', 'timestamp', 'content'])
    for post in timeline:
        timeline_writer.writerow([post['id'], post['account']['username'], post['created_at'], post['content']])
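
To check the export or continue the analysis in a later session, the file can be read back with `csv.DictReader`. This is a minimal sketch; `load_timeline` is a hypothetical helper, and it assumes the file was written as UTF-8. All values come back as strings, so ids and timestamps have to be converted if you need them as numbers or dates.

```python
import csv


def load_timeline(path="data/timeline.csv"):
    """Read the exported timeline into a list of dicts keyed by column name."""
    # newline="" is needed so that line breaks inside quoted fields survive
    with open(path, mode="r", newline="", encoding="utf-8") as timeline_file:
        return list(csv.DictReader(timeline_file))
```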