☁️ A databaseless cloud solution that only requires Python and standard *nix tools.

SimpleCloud

Cloud storage made simple.

Current state:

  • I'm going back to Paramiko sockets (pardon, Channels). It does everything I want and it's a perfectly valid solution. The only concerns are:
    • No true asynchronous behaviour. This can be countered with selector.select(). Frankly, I don't know if it's more or less efficient, and I know asyncio is THE true way to asynchronously check if a resource is available for reading or writing, but it's far too complicated right now.
    • No webserver. Web GUIs are for noobs. But seriously, I guess HTTPS login will get done with a basic websocket server on top (Flask+SocketIO) and with password authentication instead of building SSL certificates out of SSH keys. That is a disgusting method compared to what I want, but it's what everyone else does and such is life for now. I don't want to compromise future featuresets with decisions made now (which is why I spent weeks evaluating webservers), but at some point I have to call it quits, and if the ssl module ever gets better I'll be happy to adapt to it.
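
As a rough illustration of the select-based approach mentioned above, here is a minimal sketch using Python's standard selectors module. It is not SimpleCloud's code: a plain socket pair stands in for a Paramiko Channel, and wait_and_read is a hypothetical helper name invented for this example.

```python
import selectors
import socket


def wait_and_read(sock, timeout=1.0):
    """Wait until sock is readable (or the timeout expires), then read from it.

    Returns the received bytes, or None if nothing arrived in time.
    """
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    try:
        events = sel.select(timeout)  # blocks for at most `timeout` seconds
        if not events:
            return None
        return sock.recv(4096)
    finally:
        sel.close()


# Demo: a local socket pair plays the role of the two ends of a channel.
left, right = socket.socketpair()
left.sendall(b"ping")
received = wait_and_read(right)
print(received)
left.close()
right.close()
```

The same selector can watch several sockets at once, which is what makes this a workable substitute for asyncio in a small program.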

Features:

We aim to create a simple cloud storage solution that doesn't reinvent the wheel. Almost all cloud storage systems run on an operating system with user lists, permissions and filesystems built into it and then add their own users, their own permissions and their own filesystems. This not only makes room for inconsistencies that have to be fixed manually, but it also generates useless overhead that affects performance. SimpleCloud exists to fix this.

  • Node architecture: there is no concept of client or server in SimpleCloud past the point of who approached whom first. It's all nodes.
  • External changes: since everyone is a node, SimpleCloud allows the node "acting as a backup server" to add and alter files and have those changes propagated to clients. You can have files download and get transferred to a user folder when they're done, knowing they'll be available on the "node acting as a client".
  • No databases: well, at least not the kind where a filesystem is needlessly replicated on a database. There are no SimpleCloud users, only system users. There are no SimpleCloud directories, only system directories. There are no redundant records.
  • Stateless: no databases means no permanent state, which means no inconsistencies. Every time you start SimpleCloud it starts fresh, regardless of what changed since you shut it down.
  • No new protocols: SimpleCloud uses the stable, industry-leading protocol SSH to ensure security and stability in connections, made by people much smarter and more experienced at cryptography than us. You can choose what sort of authentication to use, and you can re-use your existing public key.
  • Sync and stream: the user chooses, for each directory, how they want to access it. SimpleCloud allows both syncing files from a client to a server and directly accessing the server's storage from the client.
  • Minimal server dependencies: all of the work is done on the client. The server is only expected to possess a user system and the ability to create a network filesystem connection (FUSE) through either SMB or SSH.
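
To make the "stateless" idea a bit more concrete, here is a small sketch of how two nodes could be compared from scratch on every start, using nothing but the filesystem. This is only an assumption about how such a scan might look, not SimpleCloud's actual implementation: scan_tree and compare are hypothetical names, two temporary directories stand in for two nodes, and hashing whole files is chosen for clarity rather than speed.

```python
import hashlib
import os
import tempfile


def scan_tree(root):
    """Return {relative_path: sha256_hex} for every file under root."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            snapshot[os.path.relpath(full, root)] = digest
    return snapshot


def compare(local, remote):
    """Compare two snapshots without relying on any stored state."""
    shared = set(local) & set(remote)
    return {
        "only_local": sorted(set(local) - set(remote)),
        "only_remote": sorted(set(remote) - set(local)),
        "different": sorted(p for p in shared if local[p] != remote[p]),
    }


def _write(root, name, text):
    with open(os.path.join(root, name), "w") as fh:
        fh.write(text)


# Demo: two temporary directories stand in for two nodes.
with tempfile.TemporaryDirectory() as node_a, tempfile.TemporaryDirectory() as node_b:
    _write(node_a, "same.txt", "hello")
    _write(node_b, "same.txt", "hello")
    _write(node_a, "notes.txt", "v1")
    _write(node_b, "notes.txt", "v2")
    _write(node_a, "new.txt", "only on A")
    result = compare(scan_tree(node_a), scan_tree(node_b))

print(result)
```

Because nothing is persisted between runs, there is no record that can drift out of step with the disk.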

Python dependencies:

Preparing and encrypting the storage

Sources and further reading:
https://wiki.archlinux.org/index.php/Dm-crypt/Device_encryption

https://www.digitalocean.com/community/tutorials/how-to-use-dm-crypt-to-create-an-encrypted-volume-on-an-ubuntu-vps

In this guide we're assuming the server's system and the storage are physically separate. In our case the system is on a 60GB SSD while all the data is on a 1TB HDD. If you're using a brand new disk for storage it should be empty, but it's always good practice to clean it.

Finding the storage device

Log in to your server, either locally or over SSH, then find out which disk is which with lshw:

# dnf install lshw
# lshw -class disk

This will list all the storage devices attached to your system, including model, manufacturer and capacity. You can complement this with:

$ lsblk

This lists the partitions and mountpoints of each device. Note down which is your storage disk. In our case it's sda.
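
If you'd rather pick out the storage disk programmatically, lsblk can emit JSON (lsblk --json -o NAME,SIZE,TYPE,MOUNTPOINT). The sketch below parses that format and lists whole disks with the mountpoints found on them. The sample JSON is made up to resemble the setup in this guide, list_disks is a hypothetical helper, and it only looks one level deep into partitions.

```python
import json

# Illustrative sample only; real output comes from
# `lsblk --json -o NAME,SIZE,TYPE,MOUNTPOINT`.
SAMPLE_LSBLK_JSON = """
{
  "blockdevices": [
    {"name": "sda", "size": "931.5G", "type": "disk", "mountpoint": null},
    {"name": "sdb", "size": "55.9G", "type": "disk", "mountpoint": null,
     "children": [
       {"name": "sdb1", "size": "1G", "type": "part", "mountpoint": "/boot"},
       {"name": "sdb2", "size": "54.9G", "type": "part", "mountpoint": "/"}
     ]}
  ]
}
"""


def list_disks(lsblk_json):
    """Return [(name, size, [mountpoints])] for every whole disk."""
    disks = []
    for dev in json.loads(lsblk_json)["blockdevices"]:
        if dev.get("type") != "disk":
            continue
        mounts = []
        if dev.get("mountpoint"):
            mounts.append(dev["mountpoint"])
        for child in dev.get("children", []):
            if child.get("mountpoint"):
                mounts.append(child["mountpoint"])
        disks.append((dev["name"], dev["size"], mounts))
    return disks


disks = list_disks(SAMPLE_LSBLK_JSON)
# Disks with nothing mounted are the candidates for the storage drive.
unmounted = [name for name, _size, mounts in disks if not mounts]
print(disks)
print(unmounted)
```

A disk with nothing mounted is only a candidate; always double-check the model and capacity before wiping anything.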

Clean the device

Understand that any contents you currently have on the storage disk will be unrecoverable after this step. I like to use shred, since it's made to actually shred files and devices. Here -n 0 skips the random passes and -z adds a single final pass of zeros:

# shred -vzn 0 /dev/sda

You can also use dd:

# dd if=/dev/zero of=/dev/sda bs=10M status=progress
2621440000 bytes (2.6 GB, 2.4 GiB) copied, 11.0281 s, 238 MB/s => this line gets updated
95387+0 records in
95386+0 records out
1000204886016 bytes (1.0 TB, 932 GiB) copied, 6244.26 s, 160 MB/s

This writes a stream of zero bytes over the entire storage drive. Again, please make sure to use the correct drive path. This process takes a long time for a 1TB drive (in our case 6244 seconds, which is about 1h 44m). The bs=10M option tells dd to read and write blocks of 10 MiB at a time, which speeds things up at the expense of RAM. If you're serious about your security you can use if=/dev/urandom or even if=/dev/random instead, which supply pseudo-random bytes instead of just zeros, but in turn it takes quite a while longer.
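
The numbers in the dd output above can be checked with a few lines of arithmetic. This sketch uses only the figures from our run (byte count, elapsed time, block size); format_duration and estimate_seconds are hypothetical helpers, and any estimate for another disk assumes a similar sustained write speed, which real drives rarely guarantee.

```python
import math

DISK_BYTES = 1_000_204_886_016   # total bytes reported by dd
ELAPSED_S = 6244.26              # elapsed seconds reported by dd
BLOCK_BYTES = 10 * 1024 * 1024   # bs=10M means 10 MiB per block


def format_duration(seconds):
    """Format a number of seconds as 'Hh MMm SSs'."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


def estimate_seconds(size_bytes, bytes_per_second):
    """Rough wipe time for a disk, assuming a constant write speed."""
    return size_bytes / bytes_per_second


throughput_mb_s = DISK_BYTES / ELAPSED_S / 1e6   # dd reports decimal MB/s
blocks = math.ceil(DISK_BYTES / BLOCK_BYTES)     # the last block is partial

print(f"{throughput_mb_s:.0f} MB/s over {blocks} blocks")
print(format_duration(ELAPSED_S))
print(format_duration(estimate_seconds(2 * DISK_BYTES, DISK_BYTES / ELAPSED_S)))
```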

Encrypt the storage

(If you want, you can run these steps without root permissions after changing the device's owner with chown <your username> /dev/sda.)

By now the storage device, which in our case is /dev/sda, should contain only 0's. Many guides work on the assumption that we want to create one file inside a partition to use as an encrypted container, or to encrypt a single partition. In this case we chose to have an entire hard drive dedicated to storage and encrypted, so we'll skip a few steps that you'd have to do in those instances. If you're working with a single storage device, you'll probably want your encrypted storage sitting on an already-existing partition; in that case check out the very well-written Digital Ocean guide linked above. We can easily encrypt the whole drive with cryptsetup.

# cryptsetup -yv luksFormat /dev/sda

This step takes a few seconds. It will prompt you for a passphrase (twice, since we used the -y flag). When it's done, the hard drive has been formatted as a LUKS container, which means anything written through it is automatically encrypted and anything read is decrypted. Confirm this with another cryptsetup command:

$ cryptsetup luksDump /dev/sda
Version: 1
Cipher name: aes
Cipher mode: xts-plain64
Hash spec: sha256
Payload offset: 4096
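
If you want to check those header fields from a script, the dump is simple "Key: value" text. The sketch below parses the abbreviated output shown above (a real luksDump prints more lines, such as key slots and the UUID); parse_luks_dump is a hypothetical helper, not part of cryptsetup.

```python
# Abbreviated sample taken from the output shown in this guide.
SAMPLE_DUMP = """Version: 1
Cipher name: aes
Cipher mode: xts-plain64
Hash spec: sha256
Payload offset: 4096
"""


def parse_luks_dump(text):
    """Turn 'Key: value' lines into a dict, ignoring anything else."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


header = parse_luks_dump(SAMPLE_DUMP)
print(header["Cipher name"], header["Cipher mode"])
```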

LUKS wraps a block device in encryption: you essentially open a LUKS device to reveal a filesystem inside it. There isn't one yet, so we'll need to create it. For that, let's start by opening our container:

# cryptsetup luksOpen /dev/sda things

The decrypted device /dev/mapper/things will be created. This isn't an actual physical device (that is still the /dev/sda disk); it's just the decrypted view of the LUKS container, which so far contains nothing. Let's create an ext4 filesystem inside.

# mkfs -t ext4 /dev/mapper/things

Finally, an optional step: ext filesystems are by default created with about 5% of their space reserved for the root user. This made sense in the 90's for system drives where space was very limited, since filling up a drive entirely might render the system unusable, but for a 1TB drive that only serves as backup, it's irrelevant. So we can set the reserved space to 0:

# tune2fs -m 0 /dev/mapper/things
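
To get a feel for how much space that reservation represents, here is a quick calculation. It uses the raw device size from the dd output earlier as an approximation (the filesystem itself is slightly smaller because of the LUKS header), and reserved_bytes is a hypothetical helper.

```python
DEVICE_BYTES = 1_000_204_886_016  # raw size of our 1TB drive


def reserved_bytes(total_bytes, percent=5.0):
    """Bytes set aside when `percent` of the space is reserved."""
    return int(total_bytes * percent / 100)


reserved = reserved_bytes(DEVICE_BYTES)
print(f"{reserved / 1e9:.1f} GB ({reserved / 2**30:.1f} GiB) reserved at 5%")
print(f"{reserved_bytes(DEVICE_BYTES, 0)} bytes reserved after tune2fs -m 0")
```

On a drive this size the default reservation is on the order of 50 GB, which is why it's worth reclaiming for a pure backup disk.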

Opening and closing the encrypted device

Cryptsetup containers can sometimes freeze doing I/O, and you won't be able to close them while a process is accessing them.

# cryptsetup close things
device-mapper: remove ioctl on things failed: Device or resource busy => this line gets repeated a lot
Device things is still in use.

To close the container anyway, you need to find the PID of the process holding it open and kill it. Start by looking up the container's device numbers:

# dmsetup ls

This returns something like:

things (253:0)

Those are the device's major and minor numbers, which lsof prints as 253,0. Next find the process responsible:

# lsof | grep 253,0

The second column contains the PID. Just kill it with the aptly-named command kill:

# kill -9 <PID>

And now you should be able to close the container:

# cryptsetup close things
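
The lookup above (dmsetup ls, then grepping lsof for the device numbers) can be sketched in Python as plain text parsing. The lsof sample below is invented for illustration, parse_dmsetup_ls and pids_using are hypothetical helpers, and the column layout of lsof can vary between versions, so treat this as a starting point rather than a robust tool.

```python
import re

DMSETUP_SAMPLE = "things (253:0)"

# Hypothetical lsof output; real listings are much longer.
LSOF_SAMPLE = """COMMAND  PID USER  FD  TYPE DEVICE SIZE/OFF NODE NAME
rsync   4242 root  cwd  DIR  253,0     4096    2 /mnt/things
bash    1337 root  cwd  DIR   8,17     4096    2 /root
"""


def parse_dmsetup_ls(text):
    """Map each device-mapper name to 'major,minor' as lsof prints it."""
    devices = {}
    for line in text.splitlines():
        match = re.match(r"^(\S+)\s+\((\d+)[:,]\s*(\d+)\)$", line.strip())
        if match:
            devices[match.group(1)] = f"{match.group(2)},{match.group(3)}"
    return devices


def pids_using(lsof_text, device):
    """Return the sorted PIDs of lsof rows that mention the given device."""
    pids = set()
    for line in lsof_text.splitlines():
        fields = line.split()
        # The PID is the second column; the header row is skipped by isdigit().
        if len(fields) > 1 and device in fields and fields[1].isdigit():
            pids.add(int(fields[1]))
    return sorted(pids)


devices = parse_dmsetup_ls(DMSETUP_SAMPLE)
busy = pids_using(LSOF_SAMPLE, devices["things"])
print(devices, busy)
```

Matching on whole fields avoids the false positives that a bare grep for 253,0 could pick up from unrelated columns.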