Plan to reduce my Linode cost, to be executed by end of 2021 #6

Open
vipulnaik opened this issue Jan 4, 2020 · 1 comment
vipulnaik commented Jan 4, 2020

Background

Currently, I'm using the $160/month Linode plus $40/month for backups, for a total of $200/month. The specs of this Linode are 32 GB RAM, 8 cores, and 640 GB SSD.

The main reason I'm using such a high-powered Linode is disk storage: I am currently using 216 GB of the 640 GB limit, though the number can fluctuate. Most of that storage need comes from Wikipedia Views. Specifically, the wikipedia-clickstream folder takes up about 50 GB, and I believe Wikipedia Views accounts for at least 30% (and probably closer to 50%) of the 89 GB that MySQL is taking up on the Linode.

Wikipedia Views also has a high data growth rate:

  • Clickstream: This is growing at a rate of 2 GB per month. The current size is 52 GB; by the end of 2020, the size will be about 76 GB, and by the end of 2021, it will be about 100 GB.
  • MySQL: The number of rows grows by 6-7 million per month; the current number of rows is about 322 million, so by the end of 2020, the number of rows will be about 400 million. By the end of 2021, it'll be between roughly 465 and 490 million.
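
The projections above are straight linear extrapolations from the January 2020 figures. A minimal sketch of that arithmetic, assuming growth stays constant month over month (the function and constant names are made up for illustration):

```python
def project(current, monthly_growth, months):
    """Linear projection: current value plus a constant monthly increment."""
    return current + monthly_growth * months


# Figures taken from the issue text (January 2020 baseline).
CLICKSTREAM_GB = 52
CLICKSTREAM_GB_PER_MONTH = 2
MYSQL_ROWS_MILLIONS = 322
MYSQL_ROW_GROWTH_MILLIONS = (6, 7)  # low and high ends of the stated range


def summarize(months):
    """Return (clickstream GB, low row estimate, high row estimate) after `months`."""
    low, high = MYSQL_ROW_GROWTH_MILLIONS
    return (
        project(CLICKSTREAM_GB, CLICKSTREAM_GB_PER_MONTH, months),
        project(MYSQL_ROWS_MILLIONS, low, months),
        project(MYSQL_ROWS_MILLIONS, high, months),
    )


if __name__ == "__main__":
    for label, months in (("end of 2020", 12), ("end of 2021", 24)):
        gb, rows_low, rows_high = summarize(months)
        print(f"{label}: clickstream ~{gb} GB, MySQL rows {rows_low}-{rows_high} million")
```

If the real growth rate changes (for example, if new clickstream languages get added), only the constants need updating.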

Fallback extreme plans

  • Continue with the status quo. I believe the status quo will last until around 2022 or 2023 without the need to upgrade the instance. This isn't ideal, because it means I'm shouldering a high expense ($200/month) that I don't need to.
  • Shut Wikipedia Views down, completely or partially, by 2023:
    • For instance, stop downloading and updating clickstream data. This automatic sunset of clickstream data will have minimal impact on the regular use of WV.
    • Stop caching pageview counts in MySQL except for explicitly queried pages (in other words, stop running the monthly jobs to fill data automatically into MySQL). That way, only pages that people are directly querying get recorded in MySQL.

Better plan that involves no sacrifice

A better plan would be to move various folders to Linode block storage:

  • Move wikipedia-clickstream to Linode block storage
  • Move backups folders to Linode block storage (although these aren't solely for Wikipedia Views, moving them off will help)
  • See if it's possible to set up MySQL for Wikipedia Views on Linode block storage. This will reduce the growth pressure on the main MySQL engine. I might also be interested in moving all of MySQL over to block storage if it can't be split between the two disks.
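
One possible way to do the folder moves without changing any paths that existing scripts rely on is to move the folder onto the mounted volume and leave a symlink at the old location. This is only a sketch of that idea, not a tested migration procedure: the mount point in the usage note is hypothetical, and for a folder this large a resumable copy tool such as rsync may be a safer choice than a single `shutil.move`.

```python
import os
import shutil


def relocate_with_symlink(src_dir, volume_mount):
    """Move src_dir under volume_mount and leave a symlink at the old path.

    volume_mount is assumed to be an already-mounted block storage volume.
    Returns the new location of the directory.
    """
    src_dir = os.path.abspath(src_dir)
    dest = os.path.join(volume_mount, os.path.basename(src_dir))

    if os.path.islink(src_dir):
        raise ValueError(f"{src_dir} is already a symlink; nothing to move")
    if not os.path.isdir(src_dir):
        raise NotADirectoryError(src_dir)
    if os.path.exists(dest):
        raise FileExistsError(dest)

    # Across filesystems, shutil.move copies the tree and then deletes the source.
    shutil.move(src_dir, dest)
    # Old path now points at the new location, so existing scripts keep working.
    os.symlink(dest, src_dir)
    return dest
```

Usage would look like `relocate_with_symlink("/path/to/wikipedia-clickstream", "/mnt/block-volume")`, where both paths are placeholders. This approach does not apply directly to MySQL, which needs its own data-directory configuration change and a server stop before any files move.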

Once things are moved to block storage, I will then downgrade the Linode instance appropriately. From a disk and RAM perspective, I believe that the $80/month plan + $20/month for backups should be good enough. However, it might be worth investigating whether we can safely go down to $40/month + $10/month for backups.
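
To compare the options, the block storage cost has to be added back in, since it offsets part of the savings. The sketch below uses the instance and backup prices from this issue; the block storage price of $0.10 per GB per month and the 200 GB volume size are assumptions for illustration and should be checked against current Linode pricing and the actual volume size needed.

```python
# Assumed block storage price in dollars per GB per month (not stated in this issue).
ASSUMED_BLOCK_PRICE_PER_GB = 0.10


def monthly_cost(instance, backups, block_gb=0, block_price_per_gb=ASSUMED_BLOCK_PRICE_PER_GB):
    """Total monthly cost in dollars: instance + backups + block storage."""
    return instance + backups + block_gb * block_price_per_gb


def monthly_savings(current, proposed):
    """Dollars saved per month by switching from `current` to `proposed`."""
    return current - proposed


if __name__ == "__main__":
    status_quo = monthly_cost(160, 40)             # current plan, no block storage
    target = monthly_cost(80, 20, block_gb=200)    # 200 GB volume is hypothetical
    stretch = monthly_cost(40, 10, block_gb=200)   # same hypothetical volume
    for name, cost in (("status quo", status_quo), ("target", target), ("stretch", stretch)):
        saved = monthly_savings(status_quo, cost)
        print(f"{name}: ${cost:.2f}/month (saves ${saved:.2f}/month)")
```

Under these assumptions, the block storage volume costs far less than the difference between instance tiers, so the plan still saves money as long as the volume stays in the low hundreds of GB.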

Alternate option: downgrade first, switch to block storage later?

It may be possible to downgrade to $80/month relatively quickly, because my current disk usage (216 GB) is below the 320 GB of storage that plan provides. However, if I do this downgrade before switching to block storage, there is a risk of needing to figure out block storage in a few months, under time pressure, which can be tricky.
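
How much time the downgrade-first route buys depends on the total disk growth rate, which this issue only quantifies for clickstream (2 GB per month). A rough sketch of the headroom calculation, where the total growth rate and the free-space reserve are inputs to be filled in rather than known values:

```python
import math


def months_of_headroom(used_gb, capacity_gb, growth_gb_per_month, reserve_gb=0):
    """Whole months before usage reaches capacity minus a free-space reserve.

    Assumes constant linear growth; real usage fluctuates, so treat the
    result as an optimistic estimate.
    """
    if growth_gb_per_month <= 0:
        raise ValueError("growth_gb_per_month must be positive")
    available = capacity_gb - reserve_gb - used_gb
    return max(0, math.floor(available / growth_gb_per_month))
```

For example, `months_of_headroom(216, 320, 2)` counts only clickstream growth and ignores MySQL growth, backups, and fluctuations, so it is an upper bound; passing a larger growth rate or a nonzero `reserve_gb` shortens the estimate accordingly.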
