Closed
59 commits
7b67381
Fix rendering salt on distributed
NathanFlurry Aug 4, 2023
b01023f
Remove old config dir
NathanFlurry Aug 4, 2023
8f4ea66
Remove redis-search
NathanFlurry Aug 4, 2023
dd304e2
Fix optional secret issue
NathanFlurry Aug 4, 2023
c89139e
Fix getter source URL
NathanFlurry Aug 4, 2023
a40781b
Update changelog
NathanFlurry Aug 4, 2023
5e6b2db
Capture events in bg
NathanFlurry Aug 4, 2023
3b0dfb0
Fix importing secrets in sls
NathanFlurry Aug 4, 2023
f5740f0
Simplify specifying roles
NathanFlurry Aug 4, 2023
a4c3519
Clean up Traefik service configs to allow for multiple Traefik services
NathanFlurry Aug 4, 2023
3686de4
Update assets.rivet.gg
NathanFlurry Aug 4, 2023
fc82c3a
SImplfiy singleton handling for services
NathanFlurry Aug 4, 2023
bacb241
Fix deploying EE APIs
NathanFlurry Aug 4, 2023
f0ce792
Fix ingress SaltStack
NathanFlurry Aug 4, 2023
c840870
Update admin auth
NathanFlurry Aug 4, 2023
a322907
Add troubleshooting for failed minion
NathanFlurry Aug 5, 2023
ce3a805
Add start-at flag for infra commands
NathanFlurry Aug 5, 2023
d9efde4
Auto install rsync on Salt Master
NathanFlurry Aug 5, 2023
6c6dffd
WIP getting ATS working again
NathanFlurry Aug 5, 2023
aa6aa8f
Fix trafficserver config
NathanFlurry Aug 5, 2023
909b605
Add B2 Nomad artifact support
NathanFlurry Aug 5, 2023
06f7595
Allow toggling multipart uploads
NathanFlurry Aug 5, 2023
2dcc595
Remove todo!
NathanFlurry Aug 5, 2023
a1a99c0
Fix module version set
NathanFlurry Aug 5, 2023
ead70f6
Remove ats build
NathanFlurry Aug 5, 2023
9611b4d
Fix internal dashboard docs
NathanFlurry Aug 5, 2023
d2da6d6
Fix invalid message parameters
NathanFlurry Aug 5, 2023
13aa561
Increate kv write limit
NathanFlurry Aug 5, 2023
0abb8db
Add idle lobbies to faker configs to prevent race condition in tests
NathanFlurry Aug 6, 2023
a29e551
Fix panic with missing api-party URL
NathanFlurry Aug 6, 2023
ceb67e8
Add override dependencies & recurse dependency list
NathanFlurry Aug 6, 2023
1026c40
Fix Cargo dependencies
NathanFlurry Aug 6, 2023
2cac855
Fix activities
NathanFlurry Aug 6, 2023
2af1141
Fix developer link
NathanFlurry Aug 6, 2023
761c110
Update link addr
NathanFlurry Aug 6, 2023
f5e452d
update migration script
NathanFlurry Aug 6, 2023
885a589
Add bolt headless reading secrets
NathanFlurry Aug 6, 2023
04f9184
B2 file lock hotfix
NathanFlurry Aug 6, 2023
e4dc7bd
Validate dir exists before mounting
NathanFlurry Aug 6, 2023
dc07c5c
Fix get_secret.sh script to use Bash
NathanFlurry Aug 6, 2023
4a0e20c
ATS dirs
NathanFlurry Aug 6, 2023
d5799a6
Change default user searchable
NathanFlurry Aug 6, 2023
a7c616f
Fix worker threads
NathanFlurry Aug 6, 2023
10b3f7d
Add list migrations
NathanFlurry Aug 6, 2023
56cecdc
Enable Tokio Console
NathanFlurry Aug 6, 2023
a1b103d
Update Tokio
NathanFlurry Aug 6, 2023
365d3ae
Add NATS docs
NathanFlurry Aug 7, 2023
a228940
Fix misconfigured Nomad dynamic firewall
NathanFlurry Aug 7, 2023
b67f5f9
Update prod Rust version
NathanFlurry Aug 7, 2023
2ec3fbf
Remove IPv6 inbound for Nebula
NathanFlurry Aug 7, 2023
0520bde
Update async-nats to 0.30
NathanFlurry Aug 7, 2023
ce1efaa
WIP
MasterPtato Aug 7, 2023
2330d5d
Merge remote-tracking branch 'origin/nathan/hotfix' into max/SVC-2900
MasterPtato Aug 7, 2023
8fb1988
WIP trying to fix stuff
NathanFlurry Aug 7, 2023
73ca3d5
WIP
MasterPtato Aug 9, 2023
f31c699
Merge remote-tracking branch 'origin/nathan/hotfix-2' into max/SVC-2900
MasterPtato Aug 10, 2023
99db869
Get new s3 routing working
MasterPtato Aug 10, 2023
c6911f5
Get jobs working through ATS
MasterPtato Aug 11, 2023
eae7fa9
Fix media routing
MasterPtato Aug 11, 2023
16 changes: 14 additions & 2 deletions CHANGELOG.md
@@ -9,12 +9,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- **Cloud** Support multipart uploads for builds
- **Infra** Support configuring multiple S3 providers
- **Infra** Support multipart uploads
- **Infra** Replace Promtail-based log shipping with native Loki Docker driver
- **Infra** Add local Traefik Cloudflare proxy daemon for connecting to Cloudflare Access services
- **Infra** Local Traefik Cloudflare proxy daemon for connecting to Cloudflare Access services
- **Infra** Upload service builds to default S3 provider instead of hardcoded bucket
- **Bolt** Support for connecting to Redis databases with `bolt redis sh`
- **Bolt** Add confirmation before running any command in the production namespace
- **Bolt** Confirmation before running any command in the production namespace
- **Bolt** `--start-at` flag for all infra commands

### Changed

@@ -31,6 +34,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **Infra** Update Redis Exporter to 1.52.0
- **Infra** Update Redis to 7.0.12
- **Infra** Update Traefik to 2.10.4
- **Bolt** PostHog events are now captured in a background task
- **Bolt** Auto-install rsync on Salt Master
- **Bolt** Recursively add dependencies from overridden services when using additional roots
- **KV** Significantly increase rate limit of all endpoints

### Security

@@ -40,4 +47,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- **Portal** Skip captcha if no Turnstile key provided
- **Infra** Resolve [RUSTSEC-2023-0044](https://rustsec.org/advisories/RUSTSEC-2023-0044)
- **Infra** Missing dependency on mounting volume before setting permissions of /var/* for Cockroach, ClickHouse, Prometheus, and Traffic Server
- **Chirp** Empty message parameters now have placeholder so NATS doesn't throw an error
- **Chirp** Messages with no parameters no longer have a trailing dot
- **Bolt** Correctly resolve project root when building services natively
- **Bolt** Correctly determine executable path for `ExecServiceDriver::UploadedBinaryArtifact` with different Cargo names

18 changes: 9 additions & 9 deletions docs/getting_started/INTERNAL_DASHBOARDS.md
@@ -6,14 +6,14 @@ Exposed tunnels & applications are configured [here](/lib/bolt/core/src/dep/terr

Replace `MAIN_DOMAIN` with the value of `dns.domain.main`.

- [Consul](https://consul.MAIN_DOMAIN))
- [Nomad](https://nomad.MAIN_DOMAIN))
- [Cockroach](https://cockroach-http.MAIN_DOMAIN))
- [ClickHouse](https://clickhouse-http.MAIN_DOMAIN))
- [Prometheus (svc)](https://prometheus-svc.MAIN_DOMAIN))
- [Prometheus (job)](https://prometheus-job.MAIN_DOMAIN))
- [Minio](https://minio-console.MAIN_DOMAIN))
- [Traefik (proxied)](https://ing-px.MAIN_DOMAIN))
- [Traefik (job)](https://ing-job.MAIN_DOMAIN))
- [Consul](https://consul.MAIN_DOMAIN)
- [Nomad](https://nomad.MAIN_DOMAIN)
- [Cockroach](https://cockroach-http.MAIN_DOMAIN)
- [ClickHouse](https://clickhouse-http.MAIN_DOMAIN)
- [Prometheus (svc)](https://prometheus-svc.MAIN_DOMAIN)
- [Prometheus (job)](https://prometheus-job.MAIN_DOMAIN)
- [Minio](https://minio-console.MAIN_DOMAIN)
- [Traefik (proxied)](https://ing-px.MAIN_DOMAIN)
- [Traefik (job)](https://ing-job.MAIN_DOMAIN)
- This does not support regional dashboards at the moment (SVC-2584)
- Will choose a random region until fixed
10 changes: 10 additions & 0 deletions docs/infrastructure/nats/TROUBLESHOOTING.md
@@ -0,0 +1,10 @@
# Troubleshooting

## Checking the health of the cluster manually...

1. `bolt ssh pool nats`
2. `nix-shell -p natscli`
3. `nats --server=10.0.44.2:4222 --user admin --password password context save default`
4. `nats context select default`
5. `nats server report connections`
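The manual steps above go through the `nats` CLI. As a rough sketch of a lighter-weight reachability probe, the Rust snippet below opens a plain TCP connection and reads the first protocol line, relying on the fact that a NATS server greets each new client with an `INFO {...}` line before any authentication. The function names `read_nats_greeting` and `is_nats_greeting` are hypothetical helpers, not part of Bolt or any NATS client library, and this checks reachability only, not cluster health.

```rust
use std::io::{BufRead, BufReader};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Connects to `addr` and returns the first line the server sends,
/// with the trailing CRLF removed. (Hypothetical helper for illustration.)
pub fn read_nats_greeting(addr: SocketAddr, timeout: Duration) -> std::io::Result<String> {
    // Fail fast if the port is unreachable.
    let stream = TcpStream::connect_timeout(&addr, timeout)?;
    // Avoid blocking forever if the peer accepts but never writes.
    stream.set_read_timeout(Some(timeout))?;
    let mut line = String::new();
    BufReader::new(stream).read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}

/// Returns true when the line looks like a NATS `INFO` greeting.
pub fn is_nats_greeting(line: &str) -> bool {
    line.starts_with("INFO ")
}
```

Assuming the address from step 3, a caller could parse `"10.0.44.2:4222"` into a `SocketAddr`, pass it to `read_nats_greeting` with a timeout of a second or two, and treat a result that satisfies `is_nats_greeting` as "the server is accepting connections".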

12 changes: 12 additions & 0 deletions docs/infrastructure/saltstack/TROUBLESHOOTING.md
@@ -31,3 +31,15 @@ Try a few things to figure this out:
- Run `pstree -p my-pid` on the `salt-minion` process to see what subcommand is being run
- Read the `salt-minion` logs with `journalctl -u salt-minion`
- Try applying specific SLS files with `salt apply 'my-minion' --sls my_file`

## Error when bootstrapping Minion: `RSA key format is not supported`

```bash
# Uninstall Salt
bolt ssh name staging-lnd-atl-crdb-05-2 'systemctl stop salt-minion; apt remove -y salt-cloud salt-common salt-minion; rm -rf /etc/salt /opt/saltstack /var/log/salt /var/cache/salt /run/salt /usr/bin/salt-*; echo Done'

# Re-run install_salt_minion
(cd infra/tf/pools && terraform state rm 'module.install_salt_minion["staging-lnd-atl-crdb-05-2"]')
bolt tf apply pools
```

10 changes: 5 additions & 5 deletions fern/api/definition/cloud/games/builds.yml
@@ -44,15 +44,15 @@ types:
docs: A tag given to the game build.
type: string
image_file: uploadCommons.PrepareFile
multipart_upload:
type: optional<boolean>

CreateGameBuildResponse:
properties:
build_id:
type: uuid
upload_id:
type: uuid
image_presigned_request:
docs: >-
**Deprecated: use image_presigned_requests instead**
type: uploadCommons.PresignedRequest
image_presigned_requests: list<uploadCommons.PresignedRequest>
image_presigned_request: optional<uploadCommons.PresignedRequest>
image_presigned_requests: optional<list<uploadCommons.PresignedRequest>>

5 changes: 2 additions & 3 deletions gen/openapi/external/spec/openapi.yml
@@ -9882,6 +9882,8 @@ components:
description: A tag given to the game build.
image_file:
$ref: '#/components/schemas/UploadPrepareFile'
multipart_upload:
type: boolean
required:
- display_name
- image_tag
@@ -9897,16 +9899,13 @@ components:
format: uuid
image_presigned_request:
$ref: '#/components/schemas/UploadPresignedRequest'
description: '**Deprecated: use image_presigned_requests instead**'
image_presigned_requests:
type: array
items:
$ref: '#/components/schemas/UploadPresignedRequest'
required:
- build_id
- upload_id
- image_presigned_request
- image_presigned_requests
CloudGamesListGameCdnSitesResponse:
type: object
properties:
@@ -7,6 +7,7 @@ Name | Type | Description | Notes
**display_name** | **String** | Represent a resource's readable display name. |
**image_file** | [**crate::models::UploadPrepareFile**](UploadPrepareFile.md) | |
**image_tag** | **String** | A tag given to the game build. |
**multipart_upload** | Option<**bool**> | | [optional]

[[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md)

@@ -5,8 +5,8 @@
Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**build_id** | [**uuid::Uuid**](uuid::Uuid.md) | |
**image_presigned_request** | [**crate::models::UploadPresignedRequest**](UploadPresignedRequest.md) | |
**image_presigned_requests** | [**Vec<crate::models::UploadPresignedRequest>**](UploadPresignedRequest.md) | |
**image_presigned_request** | Option<[**crate::models::UploadPresignedRequest**](UploadPresignedRequest.md)> | | [optional]
**image_presigned_requests** | Option<[**Vec<crate::models::UploadPresignedRequest>**](UploadPresignedRequest.md)> | | [optional]
**upload_id** | [**uuid::Uuid**](uuid::Uuid.md) | |

[[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md)
@@ -21,6 +21,8 @@ pub struct CloudGamesCreateGameBuildRequest {
/// A tag given to the game build.
#[serde(rename = "image_tag")]
pub image_tag: String,
#[serde(rename = "multipart_upload", skip_serializing_if = "Option::is_none")]
pub multipart_upload: Option<bool>,
}

impl CloudGamesCreateGameBuildRequest {
@@ -29,6 +31,7 @@ impl CloudGamesCreateGameBuildRequest {
display_name,
image_file: Box::new(image_file),
image_tag,
multipart_upload: None,
}
}
}
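To make the effect of the new optional field concrete, here is a minimal sketch, assuming a simplified stand-in for the generated request type. `CreateGameBuildRequest` and its hand-rolled `to_json` are hypothetical and exist only for illustration: the real struct also carries `image_file` and is serialized by serde. The point is that the constructor leaves `multipart_upload` unset, and an unset value is left out of the payload entirely, which is what `skip_serializing_if = "Option::is_none"` does in the generated code.

```rust
/// Simplified, hypothetical stand-in for `CloudGamesCreateGameBuildRequest`.
#[derive(Debug, Clone, PartialEq)]
pub struct CreateGameBuildRequest {
    pub display_name: String,
    pub image_tag: String,
    pub multipart_upload: Option<bool>,
}

impl CreateGameBuildRequest {
    /// Mirrors the generated constructor: the optional field starts as `None`.
    pub fn new(display_name: String, image_tag: String) -> Self {
        Self {
            display_name,
            image_tag,
            multipart_upload: None,
        }
    }

    /// Illustration-only JSON encoding that omits `multipart_upload` when it
    /// is `None`. It does not escape string contents, so it is not a
    /// replacement for a real serializer.
    pub fn to_json(&self) -> String {
        let mut out = format!(
            "{{\"display_name\":\"{}\",\"image_tag\":\"{}\"",
            self.display_name, self.image_tag
        );
        if let Some(flag) = self.multipart_upload {
            out.push_str(&format!(",\"multipart_upload\":{}", flag));
        }
        out.push('}');
        out
    }
}
```

Under this sketch, a caller opts in by setting `multipart_upload = Some(true)` after calling `new`; callers that never touch the field send the same payload as before the field existed.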
@@ -15,20 +15,20 @@
pub struct CloudGamesCreateGameBuildResponse {
#[serde(rename = "build_id")]
pub build_id: uuid::Uuid,
#[serde(rename = "image_presigned_request")]
pub image_presigned_request: Box<crate::models::UploadPresignedRequest>,
#[serde(rename = "image_presigned_requests")]
pub image_presigned_requests: Vec<crate::models::UploadPresignedRequest>,
#[serde(rename = "image_presigned_request", skip_serializing_if = "Option::is_none")]
pub image_presigned_request: Option<Box<crate::models::UploadPresignedRequest>>,
#[serde(rename = "image_presigned_requests", skip_serializing_if = "Option::is_none")]
pub image_presigned_requests: Option<Vec<crate::models::UploadPresignedRequest>>,
#[serde(rename = "upload_id")]
pub upload_id: uuid::Uuid,
}

impl CloudGamesCreateGameBuildResponse {
pub fn new(build_id: uuid::Uuid, image_presigned_request: crate::models::UploadPresignedRequest, image_presigned_requests: Vec<crate::models::UploadPresignedRequest>, upload_id: uuid::Uuid) -> CloudGamesCreateGameBuildResponse {
pub fn new(build_id: uuid::Uuid, upload_id: uuid::Uuid) -> CloudGamesCreateGameBuildResponse {
CloudGamesCreateGameBuildResponse {
build_id,
image_presigned_request: Box::new(image_presigned_request),
image_presigned_requests,
image_presigned_request: None,
image_presigned_requests: None,
upload_id,
}
}
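Because both presigned-request fields become optional in this change, a client can no longer assume either one is present. The sketch below shows one plausible way to handle that, using simplified stand-in types: `PresignedRequest` (with a hypothetical `url` field), `CreateGameBuildResponse`, and `upload_targets` are illustrative names, not the generated API. The policy of preferring the list and falling back to the single request is an assumption, not something this diff specifies.

```rust
/// Simplified, hypothetical stand-in for `UploadPresignedRequest`.
#[derive(Debug, Clone, PartialEq)]
pub struct PresignedRequest {
    pub url: String,
}

/// Simplified, hypothetical stand-in for `CloudGamesCreateGameBuildResponse`,
/// keeping only the two optional fields under discussion.
#[derive(Debug, Clone, Default)]
pub struct CreateGameBuildResponse {
    pub image_presigned_request: Option<Box<PresignedRequest>>,
    pub image_presigned_requests: Option<Vec<PresignedRequest>>,
}

/// Returns every presigned request the client should upload to.
/// Assumed policy: use the list when present, otherwise the single request,
/// otherwise nothing.
pub fn upload_targets(resp: &CreateGameBuildResponse) -> Vec<&PresignedRequest> {
    if let Some(list) = &resp.image_presigned_requests {
        return list.iter().collect();
    }
    match &resp.image_presigned_request {
        // `single` is `&Box<PresignedRequest>`; reborrow the boxed value.
        Some(single) => vec![&**single],
        None => Vec::new(),
    }
}
```

Normalizing to a single list up front keeps the upload loop identical for the single-part and multipart cases; an empty result is a signal the caller should treat as an error rather than silently skip.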
5 changes: 2 additions & 3 deletions gen/openapi/internal/spec/openapi.yml
@@ -10927,6 +10927,8 @@ components:
description: A tag given to the game build.
image_file:
$ref: '#/components/schemas/UploadPrepareFile'
multipart_upload:
type: boolean
required:
- display_name
- image_tag
@@ -10942,16 +10944,13 @@ components:
format: uuid
image_presigned_request:
$ref: '#/components/schemas/UploadPresignedRequest'
description: '**Deprecated: use image_presigned_requests instead**'
image_presigned_requests:
type: array
items:
$ref: '#/components/schemas/UploadPresignedRequest'
required:
- build_id
- upload_id
- image_presigned_request
- image_presigned_requests
CloudGamesListGameCdnSitesResponse:
type: object
properties:
5 changes: 2 additions & 3 deletions gen/openapi/internal/spec_compat/openapi.yml
@@ -756,6 +756,8 @@ components:
image_tag:
description: A tag given to the game build.
type: string
multipart_upload:
type: boolean
required:
- display_name
- image_tag
@@ -768,7 +770,6 @@ components:
type: string
image_presigned_request:
$ref: '#/components/schemas/UploadPresignedRequest'
description: '**Deprecated: use image_presigned_requests instead**'
image_presigned_requests:
items:
$ref: '#/components/schemas/UploadPresignedRequest'
@@ -779,8 +780,6 @@ components:
required:
- build_id
- upload_id
- image_presigned_request
- image_presigned_requests
type: object
CloudGamesCreateGameCdnSiteRequest:
properties:
3 changes: 3 additions & 0 deletions infra/salt/pillar/ats/init.sls
@@ -0,0 +1,3 @@
{% import_json "/srv/salt-context/rivet/secrets.json" as rivet_secrets %}

ats: {{ rivet_secrets['ats'] }}
3 changes: 2 additions & 1 deletion infra/salt/pillar/top.sls
@@ -11,13 +11,14 @@ base:
- nomad
'G@roles:traffic-server':
- s3
- ats
'G@roles:traefik':
- tls
- api-route
'G@roles:clickhouse':
- clickhouse
'G@roles:minio':
- minio
'G@roles:traefik-cloudflare-proxy':
'G@roles:cloudflare-proxy':
- cloudflare

2 changes: 1 addition & 1 deletion infra/salt/salt/clickhouse/files/clickhouse-server.service
@@ -21,7 +21,7 @@ ExecStart=/var/rivet-nix/result/clickhouse/bin/clickhouse server --config=/etc/c
LimitCORE=infinity
LimitNOFILE=500000
CapabilityBoundingSet=CAP_NET_ADMIN CAP_IPC_LOCK CAP_SYS_NICE CAP_NET_BIND_SERVICE
{% if pillar['rivet']['deploy']['local']['restrict_service_resources'] %}
{% if 'local' in pillar.rivet.deploy and pillar.rivet.deploy.local.restrict_service_resources %}
Nice=10
CPUAffinity=0
{% endif %}
46 changes: 29 additions & 17 deletions infra/salt/salt/clickhouse/init.sls
@@ -1,3 +1,23 @@
{% if grains['volumes']['ch']['mount'] %}
{% set device = '/dev/disk/by-id/scsi-0Linode_Volume_' ~ grains['rivet']['name'] ~ '-ch' %}

disk_create_clickhouse:
blockdev.formatted:
- name: {{ device }}
- fs_type: ext4

disk_mount_clickhouse:
file.directory:
- name: /var/lib/clickhouse
- makedirs: True
mount.mounted:
- name: /var/lib/clickhouse
- device: {{ device }}
- fstype: ext4
- require:
- blockdev: disk_create_clickhouse
{% endif %}

create_clickhouse_user:
user.present:
- name: clickhouse
@@ -26,24 +46,16 @@ mkdir_clickhouse:
- mode: 700
- require:
- user: create_clickhouse_user
{%- if grains['volumes']['ch']['mount'] %}
- mount: disk_mount_clickhouse
{%- endif %}

{% if grains['volumes']['ch']['mount'] %}
{% set device = '/dev/disk/by-id/scsi-0Linode_Volume_' ~ grains['rivet']['name'] ~ '-ch' %}

disk_create_clickhouse:
blockdev.formatted:
- name: {{ device }}
- fs_type: ext4

disk_mount_clickhouse:
mount.mounted:
- name: /var/lib/clickhouse
- device: {{ device }}
- fstype: ext4
- require:
- blockdev: disk_create_clickhouse
- file: mkdir_clickhouse
{% endif %}
# Remove old config directories with residual files
remove_etc_clickhouse_server_dirs:
file.absent:
- names:
- /etc/clickhouse-server/config.d
- /etc/clickhouse-server/users.d

push_etc_clickhouse_server:
file.managed:
@@ -4,9 +4,9 @@ After=network-online.target
Wants=network-online.target systemd-networkd-wait-online.service

[Service]
User=traefik_cloudflare_proxy
Group=traefik_cloudflare_proxy
ExecStart=/usr/bin/traefik --configFile=/etc/traefik_cloudflare_proxy/traefik.toml
User=cloudflare_proxy
Group=cloudflare_proxy
ExecStart=/usr/bin/traefik --configFile=/etc/cloudflare_proxy/traefik.toml
PrivateTmp=true
PrivateDevices=false
ProtectHome=true
15 changes: 15 additions & 0 deletions infra/salt/salt/cloudflare_proxy/files/consul/cloudflare-proxy.hcl
@@ -0,0 +1,15 @@
services {
name = "cloudflare-proxy"
tags = ["traefik"]

port = 9060
checks = [
{
name = "Reachable on 9060"
tcp = "127.0.0.1:9060"
interval = "10s"
timeout = "1s"
}
]
}
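The service definition above registers a TCP check against `127.0.0.1:9060` with a 1 second timeout, repeated every 10 seconds. As a rough sketch of what one such check amounts to, the Rust function below attempts a TCP connection and reports success if it is accepted in time. `tcp_check` is a hypothetical name for illustration, not Consul's implementation, and the 10 second scheduling is left to the caller.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` is accepted within `timeout`.
/// No data is sent or read; only the connection attempt matters.
pub fn tcp_check(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}
```

With the values from the config, the equivalent call would parse `"127.0.0.1:9060"` into a `SocketAddr` and pass `Duration::from_secs(1)`. Note that a check like this only shows that something is listening on the port, not that Traefik is routing correctly.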

@@ -4,4 +4,4 @@

[providers]
[providers.file]
directory = "/etc/traefik_cloudflare_proxy/dynamic"
directory = "/etc/cloudflare_proxy/dynamic"