
AWS dies after a day; three things to fix #8

Closed
3 tasks done
cgoettel opened this issue Feb 12, 2021 · 6 comments · Fixed by #12

cgoettel (Collaborator) commented Feb 12, 2021

The current AWS instance is slow as rocks, and its disk fills up in about a day. Three things to get this fixed:

  • Meet the OpenCTI system requirements: 6 CPUs, 16 GB RAM, and a minimum 32 GB disk (see the instance-type sketch after this list).
  • Log rotation for elasticsearch (and maybe others). Log rotation is already in place for journald (see Set hard limit for journald #4), and most recently it was elasticsearch filling up the disk.
  • Increase the disk size to 32 GB. This is a separate item because it isn't controlled by the instance choice.
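
A minimal Terraform sketch of the first item, assuming the instance is declared something like this (the resource and variable names are guesses, not the repo's actual code). t3.2xlarge is the smallest t3 that clears 6 CPUs and 16 GB of RAM:

```hcl
# Hypothetical sketch -- resource name and AMI variable are assumptions.
resource "aws_instance" "opencti" {
  ami           = var.ami_id   # assumed variable
  instance_type = "t3.2xlarge" # 8 vCPU / 32 GiB: smallest t3 meeting 6 CPUs and 16 GB

  # Note: instance_type does not set the disk size; hence the separate third item.
}
```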
@cgoettel cgoettel added the bug Something isn't working label Feb 12, 2021
@cgoettel cgoettel self-assigned this Feb 12, 2021
cgoettel commented:

FYI: the disk in question is an EBS volume, not the OS disk.

cgoettel commented:

The code checked into the cgoettel-expand-aws branch is ready to be tested. Notes for when I get to it on Monday:

  • Check how the EBS disk is attached (the device name is /dev/sdf; see the sketch after this list):
    • Do I need to add anything to fstab to get it working?
    • Where are the logs being written? Are they going to the new 32 GB disk, or is everything still going to the OS disk?
  • Made some changes to the variable location. Is that working?
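
For the first checklist item, here is roughly what the attachment on the branch presumably looks like (resource names are assumptions). Attaching a volume neither creates a filesystem nor mounts it, which is what the fstab question is about; note too that on Nitro instance types a /dev/sdf attachment shows up inside the guest as an NVMe device:

```hcl
# Sketch only; resource names are assumptions.
resource "aws_ebs_volume" "data" {
  availability_zone = aws_instance.opencti.availability_zone
  size              = 32          # GiB
}

resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdf"        # Nitro guests expose this as /dev/nvme1n1
  volume_id   = aws_ebs_volume.data.id
  instance_id = aws_instance.opencti.id
}
```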

cgoettel commented:

Update from work and thoughts yesterday:

  • The EBS disk is not mounted; it shows up as /dev/nvme1n1. Added code to the script to create a filesystem (if needed) and mount it (roughly sketched after this list). It's kinda dirty and will need to be reworked if another disk is added. Also did the fstab thing.
  • The logs are the real issue on this system: /usr and /var each take up about 40% of the space. Possible solutions, ranging from good to bad (including, but not limited to, awful):
    • Create partitions on the EBS volume and mount /var and /usr on it. Maybe just /var, because that's the one that will grow? Where do the connectors put their stuff? If it's on /usr, that's an issue.
    • Symlink /var and /usr to the new mount. Horrific idea: if the mount doesn't come up, the OS is wack.
    • Config changes for each of the applications to put their stuff on the new drive (or symlink the config directories, but then we're back to the previous problem). I think this is really the best solution. It sucks and it's a bunch of work (and these AWS-specific bits have to be kept separate from the Azure and GCP code).
  • Variable changes are working.
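
Roughly the shape of the "create a filesystem (if needed) and mount it" step mentioned above, written here as user_data for illustration (the device path and mount point are assumptions; the actual script may differ):

```hcl
resource "aws_instance" "opencti" {
  # ... instance arguments as above ...

  user_data = <<-EOF
    #!/bin/bash
    dev=/dev/nvme1n1
    mnt=/mnt/data
    # Only mkfs when the device has no filesystem yet, so re-runs don't wipe data.
    blkid "$dev" >/dev/null || mkfs -t xfs "$dev"
    mkdir -p "$mnt"
    # Mount by UUID so NVMe renumbering can't break it; nofail keeps the OS
    # booting even if the volume never attaches (the symlink worry above).
    uuid=$(blkid -s UUID -o value "$dev")
    grep -q "$uuid" /etc/fstab || echo "UUID=$uuid $mnt xfs defaults,nofail 0 2" >> /etc/fstab
    mount -a
  EOF
}
```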

cgoettel commented:

I have failed as an engineer and a researcher. I read one article saying you can't increase the root volume's size and took it as gospel. Moron. The extra EBS disk is no longer in the code, and the root volume is increased to 32GB (sketch below).
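
The corrected approach, sketched under the same assumed names: the separate volume and attachment resources go away, and the root volume is sized directly (on most stock AMIs, cloud-init's growpart expands the root filesystem to fill the larger volume on first boot):

```hcl
resource "aws_instance" "opencti" {
  # ... as above, with the separate EBS volume and attachment removed ...

  root_block_device {
    volume_size = 32   # GiB; cloud-init grows the root fs to match on boot
  }
}
```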

cgoettel commented:

Changed the VPC and Subnet from being created manually to being automated by Terraform. And now I can't access the instance via Systems Manager. Tryna figure out why (one guess sketched below).
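
One guess (an assumption, not a confirmed diagnosis): Session Manager only works if the instance can reach the SSM endpoints, so a freshly Terraformed VPC with no internet route needs interface endpoints for them, plus a security group allowing 443 from the subnet. A sketch with assumed names:

```hcl
# Hypothetical -- aws_vpc.main and aws_subnet.main are assumed names.
resource "aws_vpc_endpoint" "ssm" {
  for_each            = toset(["ssm", "ssmmessages", "ec2messages"])
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.us-east-1.${each.value}" # region assumed
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [aws_subnet.main.id]
  private_dns_enabled = true
}
```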

cgoettel commented:

I have exhausted myself trying to figure out why #9 is happening and can't figure it out. I've gone back to the previous code and am calling this issue quits.

@cgoettel cgoettel mentioned this issue Feb 22, 2021