Commit

added new article
downIoads committed Jan 4, 2024
1 parent 44cee78 commit 5b6bb68
Showing 23 changed files with 839 additions and 15 deletions.
Binary file modified .DS_Store
Binary file not shown.
Binary file added .github/.DS_Store
Binary file not shown.
Binary file modified content/.DS_Store
Binary file not shown.
158 changes: 158 additions & 0 deletions content/posts/python-gitlab-backup-rpi.md
@@ -0,0 +1,158 @@
---
title: "Automated self-hosted GitLab backups to Dropbox using a Python script on Raspberry Pi"
date: 2024-01-02T18:00:23+02:00
description: Simple Python script that updates your Gitlab config and data to your dropbox.
draft: false
tags: [python, programming, gitlab, crontab, rpi]
---

## Introduction

I prefer self-hosted GitLab over uploading my code to GitHub. The only disadvantage is that I need to take care of backups myself in case a hard drive dies or gets stolen. Therefore, I wrote a simple Python script that runs on my Raspberry Pi twice a week, backs up my GitLab config and data, and uploads both to my Dropbox. Specifically, I am using the free Community Edition of GitLab on my RPi.

GitLab backups consist of two parts: the config backup (which includes the keys needed to decrypt the data backup) and the data backup. It is recommended to store these two backups separately, but I do not have any sensitive data in my GitLab projects; I just want to prevent data loss. This is why my script uploads both backups to the same Dropbox account.

For this script to work, you must set up a Dropbox app on the [Dropbox developer site](https://www.dropbox.com/developers). I recommend limiting the scope of the app to a single folder. This way, the token you put into the Python script can only read and write data in this folder, and the rest of your Dropbox stays untouched.

Note that the token you can generate on the settings page of your Dropbox app is only temporary (valid for a few hours) and is not used by this script. Instead, read through [this reply](https://www.dropboxforum.com/t5/Dropbox-API-Support-Feedback/Issue-in-generating-access-token/m-p/592921/highlight/true#M27586) to learn how to obtain a permanent refresh token. The script uses this refresh token to request a new access token each time it runs. Each access token is valid for a few hours (which is enough for the script) and then expires. To sum it up: the "refresh token" is used to request new "access tokens", and an access token is what you need to access your Dropbox. I had to read lots of threads and documentation before I found the working answer linked above.
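The refresh step described above can be sketched with the Python standard library alone. This is a minimal sketch and not the method the script uses (the script calls curl): the function names `build_refresh_request`, `parse_access_token` and `fetch_access_token` are made up for illustration, while the endpoint and form fields are the same ones the script's curl command sends.

```python
import base64
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.dropbox.com/oauth2/token"


def build_refresh_request(refresh_token, app_key, app_secret):
    """Build (but do not send) the POST request that trades a refresh token for an access token."""
    form = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode("ascii")
    # HTTP Basic auth with app key and secret, equivalent to curl's "-u key:secret"
    credentials = base64.b64encode(f"{app_key}:{app_secret}".encode("ascii")).decode("ascii")
    request = urllib.request.Request(TOKEN_URL, data=form, method="POST")
    request.add_header("Authorization", "Basic " + credentials)
    return request


def parse_access_token(response_body):
    """Extract the short-lived access token from the JSON response body."""
    return json.loads(response_body)["access_token"]


def fetch_access_token(refresh_token, app_key, app_secret):
    """Send the request; this needs network access and valid credentials."""
    request = build_refresh_request(refresh_token, app_key, app_secret)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_access_token(response.read().decode("utf-8"))
```

Splitting request building and response parsing from the network call keeps both parts easy to check offline. Recent versions of the official `dropbox` SDK should also accept the refresh token directly via `dropbox.Dropbox(oauth2_refresh_token=..., app_key=..., app_secret=...)`; check the SDK documentation for your installed version before relying on that.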

Another note for setting up GitLab on your Raspberry Pi: if you get an Nginx default page instead of the GitLab login page after installation, the port GitLab tries to use is probably already taken (e.g. by Pi-hole). In that case, assign a different, unused port in the GitLab config to make GitLab work.

The script below assumes that GitLab Community Edition runs on a Raspberry Pi with the default OS and the default GitLab backup locations.

## Python
```py

# PREREQ: sudo pip3 install dropbox
# USAGE: sudo python3 <scriptname.py>
import json  # parses the JSON response of the token endpoint
import os
import subprocess

import dropbox

# gitlab folders that store the config backup (required to decrypt the data backup) and the data backup
# assumes default locations for a linux installation of gitlab community edition
LOCAL_GITLAB_CONFIG_PATH = "/etc/gitlab/config_backup"
LOCAL_GITLAB_DATA_PATH = "/var/opt/gitlab/backups"

DROPBOX_APP_KEY = "<you-have-to-edit-this-field>"
DROPBOX_APP_SECRET = "<you-have-to-edit-this-field>"
DROPBOX_REFRESH_TOKEN = "<you-have-to-edit-this-field>"
DROPBOX_DESTINATION_CONFIG = "/Config"
DROPBOX_DESTINATION_DATA = "/Data"


# access token is only valid for 4 hours, but the refresh token stays valid and is used to get a new access token.
# this function uses the refresh token to get a new valid access token
# read this to learn how to get the refresh token in the first place: https://www.dropboxforum.com/t5/Dropbox-API-Support-Feedback/Issue-in-generating-access-token/m-p/592921/highlight/true#M27586
def getNewAccessToken():
    # argument list instead of a shell string, so special characters in the secrets need no quoting
    refreshCommand = [
        "curl", "--silent", "--fail", "https://api.dropbox.com/oauth2/token",
        "-d", "grant_type=refresh_token",
        "-d", "refresh_token=" + DROPBOX_REFRESH_TOKEN,
        "-u", DROPBOX_APP_KEY + ":" + DROPBOX_APP_SECRET,
    ]
    # --fail makes curl exit non-zero on HTTP errors, so check=True raises instead of continuing
    result = subprocess.run(refreshCommand, check=True, text=True, capture_output=True)
    response_dict = json.loads(result.stdout)  # the response body is JSON
    return response_dict["access_token"]


# e.g. takes "gitlab_config_1702820194_2023_12_17.tar" and returns 1702820194 as int
# getTimestampFromConfigFilename("gitlab_config_1702820194_2023_12_17.tar")
def getTimestampFromConfigFilename(filename):
    prefix = "gitlab_config_"
    # files that are not gitlab config backups are ignored
    if prefix not in filename:
        return -1
    start = filename.find(prefix) + len(prefix)
    return int(filename[start:filename.find("_", start)])


# returns filename of newest config backup in LOCAL_GITLAB_CONFIG_PATH
def getNewestConfigBackupFilename():
    # get list of all files in gitlab config backup dir
    configBackupList = [f for f in os.listdir(LOCAL_GITLAB_CONFIG_PATH) if os.path.isfile(os.path.join(LOCAL_GITLAB_CONFIG_PATH, f))]

    # pick the name with the highest timestamp (newest backup)
    newestFilename = max(configBackupList, key=getTimestampFromConfigFilename, default=None)
    if newestFilename is None or getTimestampFromConfigFilename(newestFilename) < 0:
        raise FileNotFoundError("No gitlab config backup found in " + LOCAL_GITLAB_CONFIG_PATH)
    return newestFilename


# e.g. takes "1702758583_2023_12_16_16.5.1_gitlab_backup.tar" and returns 1702758583 as int
# getTimestampFromDataFilename("1702758583_2023_12_16_16.5.1_gitlab_backup.tar")
def getTimestampFromDataFilename(filename):
    # files that are not gitlab data backups are ignored
    if "_gitlab_backup" not in filename:
        return -1
    return int(filename[:filename.find("_")])


# returns filename of newest data backup in LOCAL_GITLAB_DATA_PATH
def getNewestDataBackupFilename():
    # get list of all files in gitlab data backup dir
    dataBackupList = [f for f in os.listdir(LOCAL_GITLAB_DATA_PATH) if os.path.isfile(os.path.join(LOCAL_GITLAB_DATA_PATH, f))]

    # pick the name with the highest timestamp (newest backup)
    newestFilename = max(dataBackupList, key=getTimestampFromDataFilename, default=None)
    if newestFilename is None or getTimestampFromDataFilename(newestFilename) < 0:
        raise FileNotFoundError("No gitlab data backup found in " + LOCAL_GITLAB_DATA_PATH)
    return newestFilename


# remotePath is the full destination path in Dropbox, including the filename
def uploadToDropbox(accessToken, localFilePath, remotePath):
    # do not print the access token: cron mails the output and the token grants access to the Dropbox folder
    dbx = dropbox.Dropbox(accessToken)
    # files_upload is meant for files up to 150 MB; larger backups need Dropbox upload sessions
    with open(localFilePath, "rb") as f:
        dbx.files_upload(f.read(), remotePath)


def main():
    # get new access token (valid 240 min)
    accessToken = getNewAccessToken()

    # ensure local gitlab paths exist
    if (not os.path.exists(LOCAL_GITLAB_CONFIG_PATH)) or (not os.path.exists(LOCAL_GITLAB_DATA_PATH)):
        print("Local gitlab paths could not be found. Terminating..")
        return

    # create gitlab backups using subprocess (must be run as sudo or will fail)
    # for this reason you must also automate this in "sudo crontab -e" (sudo is important)
    # make new backups (might take a few minutes but the program will wait)
    # check=True stops the script if a backup command fails, so no stale backup is uploaded
    subprocess.run(["sudo", "gitlab-ctl", "backup-etc"], check=True)
    subprocess.run(["sudo", "gitlab-backup", "create"], check=True)

    # get filename of newest gitlab config backup
    newestConfigFilename = getNewestConfigBackupFilename()
    newestConfigFullPath = os.path.join(LOCAL_GITLAB_CONFIG_PATH, newestConfigFilename)
    print("Newest config backup:", newestConfigFullPath)

    # get filename of newest gitlab data backup
    newestDataFilename = getNewestDataBackupFilename()
    newestDataFullPath = os.path.join(LOCAL_GITLAB_DATA_PATH, newestDataFilename)
    print("Newest data backup:", newestDataFullPath)

    # upload data (if it already exists nothing happens)
    uploadToDropbox(accessToken, newestConfigFullPath, DROPBOX_DESTINATION_CONFIG + "/" + newestConfigFilename)
    uploadToDropbox(accessToken, newestDataFullPath, DROPBOX_DESTINATION_DATA + "/" + newestDataFilename)

    print("Successfully backed up Gitlab Config and Data.")


if __name__ == "__main__":
    main()

```

## Features

Reading over the code, you have probably noticed that the script determines the newest backups and uploads only those to your Dropbox. GitLab automatically puts the current timestamp into the filenames of your backups (the naming schemes for config and data backups differ slightly), and the script uses these timestamps to find the newest backup of each kind.

Note: If your Dropbox does not have a lot of storage, you can earn about 18 GB of free permanent storage by making new accounts with your referral link. Dropbox only grants the storage once the desktop app has been installed by the referred account, so you might get the idea of setting up a VM that you revert each time to simplify the process. You might then notice that Dropbox refuses to send invitation e-mails after the third time; using a different browser might fix this (perhaps only the browser's user agent is checked as a countermeasure). This is all just guessing, though.
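The timestamp-based selection of the newest backup described in this section can also be sketched with regular expressions. This is an illustrative alternative, not the script's own code: the two patterns are assumptions derived from the example filenames in the script's comments, and `newest_backup` is a hypothetical helper name.

```python
import re

# Assumed patterns, based on the two example filenames in the script's comments
CONFIG_PATTERN = re.compile(r"^gitlab_config_(\d+)_")
DATA_PATTERN = re.compile(r"^(\d+)_.*_gitlab_backup\.tar$")


def newest_backup(filenames, pattern):
    """Return the filename with the highest embedded Unix timestamp, or None."""
    candidates = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            # Store (timestamp, name) so max() compares by timestamp first
            candidates.append((int(match.group(1)), name))
    if not candidates:
        return None
    return max(candidates)[1]
```

Returning `None` when nothing matches lets the caller decide how to react to an empty backup folder instead of silently picking an unrelated file.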

## Run this script twice per week using crontab

The script needs "sudo" to trigger the GitLab backups. For this reason, you must use "sudo crontab -e" instead of just "crontab -e". This ensures that the script runs with root rights when cron executes it. If you want to run the script twice per week, you could add the following line to your crontab (adjust paths as needed):
```bash
0 4 * * 3,6 /usr/bin/python3 /home/pi/gitlab-backup.py
```
This means the backup is triggered at 4:00 AM every Wednesday (3) and Saturday (6). Note that cron mails the output of the script to the local mailbox of the crontab's owner, which requires a working local mail setup. Run "sudo mail" to read the output of tasks from the sudo crontab (such as this script) and "mail" (without sudo) for tasks triggered by your non-sudo crontab.
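Because the script only works with root rights, it can help to fail early with a clear message when it is started from the wrong crontab. The following is a small optional sketch, not part of the original script; `is_root` and `require_root` are hypothetical helper names, and `os.geteuid()` is only available on Unix-like systems such as Raspberry Pi OS.

```python
import os
import sys


def is_root(euid):
    """On Linux, the effective user ID 0 belongs to root."""
    return euid == 0


def require_root():
    """Exit with a readable message when the script is not run as root."""
    if not is_root(os.geteuid()):
        sys.exit("This script must run as root: use 'sudo crontab -e' or 'sudo python3'.")
```

Calling `require_root()` at the start of `main()` would turn a confusing failure of the backup commands into a single line in the cron mail.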

## Conclusion

The script is very useful to me: I wrote it once, and it has been running regularly ever since. While writing it, I learned about the Dropbox API and that crontab keeps separate schedules for sudo and non-sudo users. You could improve the script by deleting backups older than x, but I did not need this feature, as my GitLab backups are not very large and I have enough Dropbox storage.
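The suggested improvement of deleting backups older than x could start from a helper like the one below. It is only a sketch under assumptions: `expired_backups` is a hypothetical name, and it assumes that the first run of ten digits in a filename is the Unix timestamp, which holds for the two example filenames in the script's comments.

```python
import re

SECONDS_PER_DAY = 86400


def expired_backups(filenames, now, max_age_days):
    """Return the backup filenames whose embedded timestamp is older than max_age_days."""
    cutoff = now - max_age_days * SECONDS_PER_DAY
    expired = []
    for name in filenames:
        # Assumption: the first 10-digit run in the name is the Unix timestamp
        match = re.search(r"\d{10}", name)
        if match and int(match.group()) < cutoff:
            expired.append(name)
    return expired
```

The function only selects candidates and deletes nothing, so the result can be printed and reviewed first. Local files could then be removed with `os.remove`, and the Dropbox SDK offers `files_delete_v2` for the remote copies; treat both as a starting point and test them on a copy of your backups.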

7 changes: 7 additions & 0 deletions public/index.xml
@@ -78,6 +78,13 @@
<guid>https://example.com/posts/d2-runewords/</guid>
<description>Introduction Diablo 2 Resurrected is one of my favorite games of all time, but there are a lot of things you need to remember. There are various online tools (the idea of a runeword calculator is not new), but many of them have a terrible User Interface, are outdated, or are located on slow websites that autoplay Twitch streams when you open them. So I decided to learn how to use wxWidgets by making an assistant GUI tool for my favorite game!</description>
</item>
<item>
<title>Automated self-hosted Gitlab backups to Dropbox using a Python script on Raspberry Pi</title>
<link>https://example.com/posts/python-gitlab-backup-rpi/</link>
<pubDate>Wed, 26 Jul 2023 18:00:23 +0200</pubDate>
<guid>https://example.com/posts/python-gitlab-backup-rpi/</guid>
<description>Introduction I prefer self-hosted Gitlab over uploading my code to Github. The only disadvantage is that I need to take care of backups myself in case a harddrive dies or gets stolen. Therefore, I wrote a simple Python script that runs on my Raspberry Pi twice a week which backs up my Gitlab config and data and uploads them to my dropbox. Specifically, I am using the free community edition of Gitlab on my rpi.</description>
</item>
<item>
<title>Python script for Llama 2 conversations</title>
<link>https://example.com/posts/llama-python/</link>
24 changes: 13 additions & 11 deletions public/page/2/index.html
@@ -74,6 +74,17 @@ <h1 class="title"><a href="/posts/d2-runewords/">C&#43;&#43; runeword calculator
<a class="readmore" href="/posts/d2-runewords/">Read more ⟶</a>
</section>

<section class="list-item">
<h1 class="title"><a href="/posts/python-gitlab-backup-rpi/">Automated self-hosted Gitlab backups to Dropbox using a Python script on Raspberry Pi</a></h1>
<time>Jul 26, 2023</time>
<br><div class="description">

Simple Python script that updates your Gitlab config and data to your dropbox.

</div>
<a class="readmore" href="/posts/python-gitlab-backup-rpi/">Read more ⟶</a>
</section>

<section class="list-item">
<h1 class="title"><a href="/posts/llama-python/">Python script for Llama 2 conversations</a></h1>
<time>Jul 26, 2023</time>
@@ -129,17 +140,6 @@ <h1 class="title"><a href="/posts/mine-your-username/">Mine your username</a></h
<a class="readmore" href="/posts/mine-your-username/">Read more ⟶</a>
</section>

<section class="list-item">
<h1 class="title"><a href="/posts/how-to-write-a-tutorial/">How to write a tutorial</a></h1>
<time>Jul 2, 2023</time>
<br><div class="description">

Things to remember when you decide to write a tutorial.

</div>
<a class="readmore" href="/posts/how-to-write-a-tutorial/">Read more ⟶</a>
</section>



<ul class="pagination">
@@ -150,6 +150,8 @@ <h1 class="title"><a href="/posts/how-to-write-a-tutorial/">How to write a tutor
</span>
<span class="page-item page-next">

<a href="/page/3/" class="page-link" aria-label="Next"><span aria-hidden="true">Next →</span></a>

</span>
</ul>

94 changes: 94 additions & 0 deletions public/page/3/index.html
@@ -0,0 +1,94 @@
<!DOCTYPE html>
<html>
<head>
<meta name="generator" content="Hugo 0.121.1">
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge"><title>Blog for Tech Enjoyers | Home </title><link rel="icon" type="image/png" href=https://www.pngmart.com/files/23/Nerd-Emoji-PNG.png /><meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="" />
<meta property="og:image" content=""/>
<link rel="alternate" type="application/rss+xml" href="https://example.com/index.xml" title="Blog for Tech Enjoyers" />
<meta property="og:title" content="Blog for Tech Enjoyers" />
<meta property="og:description" content="" />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://example.com/" />

<meta name="twitter:card" content="summary"/><meta name="twitter:title" content="Blog for Tech Enjoyers"/>
<meta name="twitter:description" content=""/>
<script src="https://example.com/js/feather.min.js"></script>

<link href="https://fonts.googleapis.com/css2?family=IBM+Plex+Mono:ital,wght@1,500&display=swap" rel="stylesheet">
<link href="https://fonts.googleapis.com/css2?family=Fira+Sans&display=swap" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Roboto+Mono" rel="stylesheet">


<link rel="stylesheet" type="text/css" media="screen" href="https://example.com/css/main.af0932513936b0cde7d4a0d5917aa65438df9a57b01fb2769e2be4bdc492944f.css" />
<link id="darkModeStyle" rel="stylesheet" type="text/css" href="https://example.com/css/dark.726cd11ca6eb7c4f7d48eb420354f814e5c1b94281aaf8fd0511c1319f7f78a4.css" disabled />



</head>
<body>
<div class="content">
<header>
<div class="main">
<a href="https://example.com/">Blog for Tech Enjoyers</a>
</div>
<nav>

<a href="/posts">Posts</a>

<a href="/about">About</a>

<a href="/tags">Tags</a>

| <a id="dark-mode-toggle" onclick="toggleTheme()" href=""></a>
<script src="https://example.com/js/themetoggle.js"></script>

</nav>
</header>

<main class="list">
<div class="site-description"></div>



<section class="list-item">
<h1 class="title"><a href="/posts/how-to-write-a-tutorial/">How to write a tutorial</a></h1>
<time>Jul 2, 2023</time>
<br><div class="description">

Things to remember when you decide to write a tutorial.

</div>
<a class="readmore" href="/posts/how-to-write-a-tutorial/">Read more ⟶</a>
</section>



<ul class="pagination">
<span class="page-item page-prev">

<a href="/page/2/" class="page-link" aria-label="Previous"><span aria-hidden="true">← Prev</span></a>

</span>
<span class="page-item page-next">

</span>
</ul>


</main>
<footer>
<div style="display:flex"></div>
<div class="footer-info">
2024 <a
href="https://github.com/athul/archie">Archie Theme</a> | Built with <a href="https://gohugo.io">Hugo</a>
</div>
</footer>



</div>

</body>
</html>
2 changes: 2 additions & 0 deletions public/posts/index.html
@@ -66,6 +66,8 @@ <h1 class="page-title">All articles</h1>
<a href="/posts/golang-unintuitive-pitfalls/">Golang&#39;s unintuitive pitfalls</a> <span class="meta">Jul 28, 2023</span>
</li><li class="post">
<a href="/posts/d2-runewords/">C&#43;&#43; runeword calculator GUI for Diablo 2 Resurrected</a> <span class="meta">Jul 27, 2023</span>
</li><li class="post">
<a href="/posts/python-gitlab-backup-rpi/">Automated self-hosted Gitlab backups to Dropbox using a Python script on Raspberry Pi</a> <span class="meta">Jul 26, 2023</span>
</li><li class="post">
<a href="/posts/llama-python/">Python script for Llama 2 conversations</a> <span class="meta">Jul 26, 2023</span>
</li><li class="post">
7 changes: 7 additions & 0 deletions public/posts/index.xml
@@ -78,6 +78,13 @@
<guid>https://example.com/posts/d2-runewords/</guid>
<description>Introduction Diablo 2 Resurrected is one of my favorite games of all time, but there are a lot of things you need to remember. There are various online tools (the idea of a runeword calculator is not new), but many of them have a terrible User Interface, are outdated, or are located on slow websites that autoplay Twitch streams when you open them. So I decided to learn how to use wxWidgets by making an assistant GUI tool for my favorite game!</description>
</item>
<item>
<title>Automated self-hosted Gitlab backups to Dropbox using a Python script on Raspberry Pi</title>
<link>https://example.com/posts/python-gitlab-backup-rpi/</link>
<pubDate>Wed, 26 Jul 2023 18:00:23 +0200</pubDate>
<guid>https://example.com/posts/python-gitlab-backup-rpi/</guid>
<description>Introduction I prefer self-hosted Gitlab over uploading my code to Github. The only disadvantage is that I need to take care of backups myself in case a harddrive dies or gets stolen. Therefore, I wrote a simple Python script that runs on my Raspberry Pi twice a week which backs up my Gitlab config and data and uploads them to my dropbox. Specifically, I am using the free community edition of Gitlab on my rpi.</description>
</item>
<item>
<title>Python script for Llama 2 conversations</title>
<link>https://example.com/posts/llama-python/</link>