Merged
47 commits
d595505
make arch ready
tevko May 6, 2025
6f810c0
Merge branch 'colinmegill/node-delphi' into te-node-delphi-arch-changes
tevko May 6, 2025
6f8136a
begin splitting up cdk code
tevko May 7, 2025
a602164
organize cdk
tevko May 7, 2025
880cd43
cdk efs and permissions fixes
tevko May 8, 2025
ef99be6
docker fixes, add block storage
tevko May 8, 2025
501e027
remove default dynamo endpoint
tevko May 9, 2025
6b9fe93
endpoint check fix
tevko May 9, 2025
ee0ade3
use us-east-1
tevko May 9, 2025
c416d29
remove more hardcoded values
tevko May 9, 2025
b44422e
adjust delphi dockerfile
tevko May 9, 2025
5502c0e
fix up setupdynamo func
tevko May 9, 2025
a440343
fix reference
tevko May 9, 2025
d9409d3
add ssl mode hardcoded
tevko May 9, 2025
bbaf00d
another place to add secure mode
tevko May 10, 2025
ecdae20
ssl fix
tevko May 10, 2025
efdce63
update iam roles
tevko May 10, 2025
c52c5bf
remove hardcoded minio from docker
tevko May 10, 2025
8398905
remove s3 check
tevko May 10, 2025
4959ce9
change s3 endpoint to none if missing
tevko May 11, 2025
c837348
more s3 config
tevko May 12, 2025
062eab7
fix indentation
tevko May 13, 2025
be5f941
s3 logic fix
tevko May 13, 2025
a197a43
indentation fix
tevko May 13, 2025
b5aee63
remove env check
tevko May 13, 2025
279c34b
comment out s3 creds
tevko May 13, 2025
5852ea5
remove location constraint
tevko May 13, 2025
bdafb6b
unique bucket name
tevko May 13, 2025
8571b5e
remove public read on aws
tevko May 13, 2025
7f8291f
fix comment
tevko May 13, 2025
99f6d12
endpoint urls in s3
tevko May 13, 2025
0fb27b7
remove another dynamo default
tevko May 13, 2025
dd1f063
more endpoint config
tevko May 14, 2025
774b3c3
Merge branch 'colinmegill/node-delphi' into te-node-delphi-arch-changes
tevko May 16, 2025
cf247a3
lint and test cleanup
tevko May 16, 2025
7ebda5c
more lint fix
tevko May 16, 2025
3d7bf40
defaults fixes
tevko May 16, 2025
271b53b
tsc fix
tevko May 19, 2025
87f1130
Merge branch 'colinmegill/node-delphi' into te-node-delphi-arch-changes
tevko May 19, 2025
b1fe80f
Merge branch 'colinmegill/node-delphi' into te-node-delphi-arch-changes
tevko May 19, 2025
d127c9c
attempt free up disk space
tevko May 19, 2025
f2ae529
multi stage builds
tevko May 20, 2025
443a7c6
Merge branch 'colinmegill/node-delphi' into te-node-delphi-arch-changes
tevko May 20, 2025
8417d1d
more space optimization
tevko May 20, 2025
70f30ac
more optimization
tevko May 20, 2025
b9afc04
debug
tevko May 20, 2025
c187a8b
remove commented out docker code, readme and makefile updates
tevko May 20, 2025
15 changes: 14 additions & 1 deletion .github/workflows/cypress-tests.yml
@@ -22,12 +22,25 @@ jobs:
- name: Checkout
uses: actions/checkout@v4

- name: Clean up runner space (Targeted)
run: |
echo "Initial disk space (before cleanup):"
df -h
echo "Removing large pre-installed software..."
sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/share/boost "$AGENT_TOOLSDIRECTORY" /opt/hostedtoolcache /usr/local/lib/android/* || echo "Some paths not found or removal failed, continuing."
echo "Cleaning apt cache..."
sudo apt-get clean -y || echo "apt-get clean failed"
echo "Pruning Docker system..."
docker system prune -af --volumes || echo "docker system prune failed"
echo "Disk space after targeted cleanup:"
df -h

- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3

- name: Build and start Docker containers
run: |
-docker compose -f docker-compose.yml -f docker-compose.test.yml --env-file test.env --profile postgres up -d --build
+docker compose -f docker-compose.yml -f docker-compose.test.yml --env-file test.env --profile postgres --profile local-services up -d --build

- name: Health Check the Server http response
uses: jtalk/url-health-check-action@v4
2 changes: 1 addition & 1 deletion Makefile
@@ -28,7 +28,7 @@ DETACH ?= false
DETACH_ARG = $(if $(filter true,$(DETACH)),-d,)

# Default compose file args
-export COMPOSE_FILE_ARGS = -f docker-compose.yml -f docker-compose.dev.yml
+export COMPOSE_FILE_ARGS = -f docker-compose.yml -f docker-compose.dev.yml --profile local-services
COMPOSE_FILE_ARGS += $(if $(POSTGRES_DOCKER),--profile postgres,)

# Set up environment-specific values
32 changes: 24 additions & 8 deletions README.md
@@ -43,7 +43,7 @@ If you're trying to set up a Polis deployment or development environment, then p
Polis comes with Docker infrastructure for running a complete system, whether for a [production deployment](#-production-deployment) or a [development environment](#-development-tooling) (details for each can be found in later sections of this document).
As a consequence, the only prerequisite to running Polis is that you install a recent `docker` (and Docker Desktop if you are on Mac or Windows).

-If you aren't able to use Docker for some reason, the various Dockerfiles found in subdirectories (`math`, `server`, `*-client`) of this repository _can_ be used as a reference for how you'd set up a system manually.
+If you aren't able to use Docker for some reason, the various Dockerfiles found in subdirectories (`math`, `server`, `delphi`, `*-client`) of this repository _can_ be used as a reference for how you'd set up a system manually.
If you're interested in doing the legwork to support alternative infrastructure, please [let us know in an issue](https://github.com/compdemocracy.org/issues).

### Quick Start
@@ -79,7 +79,7 @@ cp example.env .env


```sh
-docker compose --profile postgres up --build
+docker compose --profile postgres --profile local-services up --build
```

If you get a permission error, try running this command with `sudo`.
@@ -89,7 +89,7 @@ To avoid having to use `sudo` in the future (on a Linux or Windows machine with
Once you've built the docker images, you can run without `--build`, which may be faster. Run

```sh
-docker compose --profile postgres up
+docker compose --profile postgres --profile local-services up
```

or simply
@@ -105,14 +105,14 @@ If you have only changed configuration values in .env, you can recreate your con
fully rebuilding them with `--force-recreate`. For example:

```sh
-docker compose --profile postgres down
-docker compose --profile postgres up --force-recreate
+docker compose --profile postgres --profile local-services down
+docker compose --profile postgres --profile local-services up --force-recreate
```

To see what the environment of your containers is going to look like, run:

```sh
-docker compose --profile postgres convert
+docker compose --profile postgres --profile local-services convert
```

#### Using a local or remote (non-docker) database
@@ -139,14 +139,30 @@ make PROD start
make PROD start-rebuild
```

### Running without Local Cloud Service Emulators
To run the stack without the local MinIO and DynamoDB emulators (e.g., to test against real AWS services configured in your `.env` file), omit the `--profile local-services` flag.

Example: Run with the containerized DB but connect to external/real cloud services:

```sh
docker compose --profile postgres up
```

Example: Run with an external DB and external/real cloud services (closest to production):

```sh
docker compose up
```
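
The profile combinations described in this section can be summarized in a short sketch. It is illustrative only: `ComposeOptions` and `composeUpArgs` are hypothetical names, not part of this repository, and the sketch only assembles the command line from the two flags discussed above.

```typescript
// Illustrative only: ComposeOptions and composeUpArgs are hypothetical names,
// not part of the Polis repository.
interface ComposeOptions {
  containerizedDb: boolean; // adds `--profile postgres`
  localEmulators: boolean;  // adds `--profile local-services` (MinIO, DynamoDB)
}

function composeUpArgs(options: ComposeOptions): string[] {
  const args: string[] = ["docker", "compose"];
  if (options.containerizedDb) {
    args.push("--profile", "postgres");
  }
  if (options.localEmulators) {
    args.push("--profile", "local-services");
  }
  args.push("up");
  return args;
}

// Full local stack: containerized DB plus local cloud emulators.
const fullLocalStack = composeUpArgs({ containerizedDb: true, localEmulators: true }).join(" ");

// External DB and real cloud services (closest to production).
const closestToProduction = composeUpArgs({ containerizedDb: false, localEmulators: false }).join(" ");
```

Each profile is independent, so omitting one flag leaves the other untouched.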


### Testing out your instance

You can now test your setup by visiting `http://localhost:80/home`.

Once the index page loads, you can create an account using the `/createuser` path.
You'll be logged in right away; email validation is not required.

-When you're done working, you can end the process using `Ctrl+C`, or typing `docker compose --profile postgres down`
+When you're done working, you can end the process using `Ctrl+C`, or typing `docker compose --profile postgres --profile local-services down`
if you are running in "detached mode".

### Updating the system
@@ -227,7 +243,7 @@ git config --local include.path ../.gitconfig

#### Running as a background process

-If you would like to run docker compose as a background process, run the `up` commands with the `--detach` flag, and use `docker compose --profile postgres down` to stop.
+If you would like to run docker compose as a background process, run the `up` commands with the `--detach` flag, and use `docker compose --profile postgres --profile local-services down` to stop.

#### Using Docker Machine as your development environment

153 changes: 153 additions & 0 deletions cdk/autoscaling.ts
@@ -0,0 +1,153 @@

import { Construct } from "constructs";
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as autoscaling from 'aws-cdk-lib/aws-autoscaling';
import * as cdk from 'aws-cdk-lib';
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';
import * as cloudwatch_actions from 'aws-cdk-lib/aws-cloudwatch-actions';

export default (
self: Construct,
vpc: cdk.aws_ec2.Vpc,
instanceRole: cdk.aws_iam.Role,
ollamaLaunchTemplate: cdk.aws_ec2.LaunchTemplate,
logGroup: cdk.aws_logs.LogGroup,
fileSystem: cdk.aws_efs.FileSystem,
webLaunchTemplate: cdk.aws_ec2.LaunchTemplate,
mathWorkerLaunchTemplate: cdk.aws_ec2.LaunchTemplate,
delphiSmallLaunchTemplate: cdk.aws_ec2.LaunchTemplate,
delphiLargeLaunchTemplate: cdk.aws_ec2.LaunchTemplate,
ollamaNamespace: string,
alarmTopic: cdk.aws_sns.Topic
) => {
const commonAsgProps = { vpc, role: instanceRole };

// Ollama ASG
const asgOllama = new autoscaling.AutoScalingGroup(self, 'AsgOllama', {
vpc,
launchTemplate: ollamaLaunchTemplate,
minCapacity: 1,
maxCapacity: 3,
desiredCapacity: 1,
vpcSubnets: { subnetGroupName: 'PrivateWithEgress' },
healthCheck: autoscaling.HealthCheck.ec2({ grace: cdk.Duration.minutes(10) }),
});
asgOllama.node.addDependency(logGroup);
asgOllama.node.addDependency(fileSystem); // Ensure EFS is ready before instances start

// Web ASG
const asgWeb = new autoscaling.AutoScalingGroup(self, 'Asg', {
vpc,
launchTemplate: webLaunchTemplate,
minCapacity: 2,
maxCapacity: 10,
desiredCapacity: 2,
vpcSubnets: { subnetType: ec2.SubnetType.PUBLIC },
healthCheck: autoscaling.HealthCheck.elb({grace: cdk.Duration.minutes(5)})
});

// Math Worker ASG
const asgMathWorker = new autoscaling.AutoScalingGroup(self, 'AsgMathWorker', {
vpc,
launchTemplate: mathWorkerLaunchTemplate,
minCapacity: 1,
desiredCapacity: 1,
maxCapacity: 5,
vpcSubnets: { subnetType: ec2.SubnetType.PUBLIC },
healthCheck: autoscaling.HealthCheck.ec2({ grace: cdk.Duration.minutes(2) }),
});

// Delphi Small ASG
const asgDelphiSmall = new autoscaling.AutoScalingGroup(self, 'AsgDelphiSmall', {
vpc,
launchTemplate: delphiSmallLaunchTemplate,
minCapacity: 1,
desiredCapacity: 1,
maxCapacity: 5,
vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
healthCheck: autoscaling.HealthCheck.ec2({ grace: cdk.Duration.minutes(5) }),
});

// Delphi Large ASG
const asgDelphiLarge = new autoscaling.AutoScalingGroup(self, 'AsgDelphiLarge', {
vpc,
launchTemplate: delphiLargeLaunchTemplate,
minCapacity: 1,
desiredCapacity: 1,
maxCapacity: 3,
vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
healthCheck: autoscaling.HealthCheck.ec2({ grace: cdk.Duration.minutes(5) }),
});


// --- Scaling Policies & Alarms
const mathWorkerCpuMetric = new cloudwatch.Metric({
namespace: 'AWS/EC2',
metricName: 'CPUUtilization',
dimensionsMap: {
AutoScalingGroupName: asgMathWorker.autoScalingGroupName
},
statistic: 'Average',
period: cdk.Duration.minutes(10),
});
asgMathWorker.scaleToTrackMetric('CpuTracking', {
metric: mathWorkerCpuMetric,
targetValue: 50,
});

// Add Delphi CPU Scaling Policies & Alarms
const createDelphiCpuScaling = (asg: autoscaling.AutoScalingGroup, name: string, target: number): cloudwatch.Metric => {
const cpuMetric = new cloudwatch.Metric({
namespace: 'AWS/EC2',
metricName: 'CPUUtilization',
dimensionsMap: { AutoScalingGroupName: asg.autoScalingGroupName },
statistic: 'Average',
period: cdk.Duration.minutes(5),
});
asg.scaleToTrackMetric(`${name}CpuTracking`, {
metric: cpuMetric,
targetValue: target
});

// High CPU Alarm
const alarm = new cloudwatch.Alarm(self, `${name}HighCpuAlarm`, {
metric: cpuMetric,
threshold: 80, // Alert if CPU > 80%
evaluationPeriods: 2, // for 2 consecutive periods (10 minutes total)
datapointsToAlarm: 2, // Ensure 2 datapoints are breaching
comparisonOperator: cloudwatch.ComparisonOperator.GREATER_THAN_THRESHOLD,
alarmDescription: `Alert when ${name} instances CPU exceeds 80% for 10 minutes`,
treatMissingData: cloudwatch.TreatMissingData.IGNORE, // Or BREACHING/NOT_BREACHING as appropriate
});
// Add SNS action to the alarm
alarm.addAlarmAction(new cloudwatch_actions.SnsAction(alarmTopic));
return cpuMetric;
};
const delphiSmallCpuMetric = createDelphiCpuScaling(asgDelphiSmall, 'DelphiSmall', 60); // Target 60% CPU
const delphiLargeCpuMetric = createDelphiCpuScaling(asgDelphiLarge, 'DelphiLarge', 60); // Target 60% CPU

// Add Ollama GPU Scaling Policy
const ollamaGpuMetric = new cloudwatch.Metric({
namespace: ollamaNamespace, // Custom namespace from CW Agent config
metricName: 'utilization_gpu', // GPU utilization metric name from CW Agent config
dimensionsMap: { AutoScalingGroupName: asgOllama.autoScalingGroupName },
statistic: 'Average',
period: cdk.Duration.minutes(1),
});
asgOllama.scaleToTrackMetric('OllamaGpuScaling', {
metric: ollamaGpuMetric,
targetValue: 75,
cooldown: cdk.Duration.minutes(5), // Prevent flapping
disableScaleIn: false, // Allow scaling down
estimatedInstanceWarmup: cdk.Duration.minutes(5), // Time until instance contributes metrics meaningfully
});

return {
asgOllama,
asgWeb,
asgMathWorker,
asgDelphiSmall,
asgDelphiLarge,
commonAsgProps
}
}
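
Reviewer note on the alarm settings in `createDelphiCpuScaling`: the high-CPU alarm combines a 5-minute metric period with `evaluationPeriods: 2` and `datapointsToAlarm: 2`, so it should fire only after two consecutive 5-minute averages above 80%. The sketch below is a simplified model of that rule. It assumes no missing datapoints, and `isBreaching` is a hypothetical helper, not a CDK or CloudWatch API.

```typescript
// Simplified model of an "M out of N" alarm rule; assumes no missing datapoints.
// isBreaching is a hypothetical helper, not a CDK or CloudWatch API.
function isBreaching(
  datapoints: number[],       // one averaged CPU value per period, oldest first
  threshold: number,          // alarm threshold in percent
  evaluationPeriods: number,  // N: how many of the most recent periods to inspect
  datapointsToAlarm: number   // M: how many of those must exceed the threshold
): boolean {
  if (datapoints.length < evaluationPeriods) {
    return false; // not enough history to evaluate yet
  }
  const recent = datapoints.slice(-evaluationPeriods);
  const breaching = recent.filter((value) => value > threshold).length;
  return breaching >= datapointsToAlarm;
}

const PERIOD_MINUTES = 5;
const EVALUATION_PERIODS = 2;
// Total time window the alarm looks at: 2 periods of 5 minutes each.
const windowMinutes = PERIOD_MINUTES * EVALUATION_PERIODS;

// Two consecutive periods above 80% trigger the alarm.
const sustainedLoad = isBreaching([55, 85, 92], 80, EVALUATION_PERIODS, 2);

// A single spike followed by recovery does not.
const shortSpike = isBreaching([55, 92, 70], 80, EVALUATION_PERIODS, 2);
```

Because the comparison is strictly greater-than, a datapoint of exactly 80 does not count as breaching in this model.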
44 changes: 44 additions & 0 deletions cdk/codedeploy.ts
@@ -0,0 +1,44 @@
import { Construct } from "constructs";
import * as cdk from 'aws-cdk-lib';
import * as codedeploy from 'aws-cdk-lib/aws-codedeploy';
import * as s3 from 'aws-cdk-lib/aws-s3';

export default (
self: Construct,
instanceRole: cdk.aws_iam.Role,
asgWeb: cdk.aws_autoscaling.AutoScalingGroup,
asgMathWorker: cdk.aws_autoscaling.AutoScalingGroup,
asgDelphiSmall: cdk.aws_autoscaling.AutoScalingGroup,
asgDelphiLarge: cdk.aws_autoscaling.AutoScalingGroup,
codeDeployRole: cdk.aws_iam.Role
) => {
const application = new codedeploy.ServerApplication(self, 'CodeDeployApplication', {
applicationName: 'PolisApplication',
});

const deploymentBucket = new s3.Bucket(self, 'DeploymentPackageBucket', {
bucketName: `polis-deployment-packages-${cdk.Stack.of(self).account}-${cdk.Stack.of(self).region}`,
removalPolicy: cdk.RemovalPolicy.DESTROY,
autoDeleteObjects: true,
versioned: true,
publicReadAccess: false,
blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
});
deploymentBucket.grantRead(instanceRole);

// Deployment Group
const deploymentGroup = new codedeploy.ServerDeploymentGroup(self, 'DeploymentGroup', {
application,
deploymentGroupName: 'PolisDeploymentGroup',
autoScalingGroups: [asgWeb, asgMathWorker, asgDelphiSmall, asgDelphiLarge],
deploymentConfig: codedeploy.ServerDeploymentConfig.ONE_AT_A_TIME,
role: codeDeployRole,
installAgent: true,
});

return {
application,
deploymentBucket,
deploymentGroup
}
}
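
Reviewer note on `bucketName` in `cdk/codedeploy.ts`: the deployment bucket name is built from the account ID and region so that it stays globally unique. The sketch below shows how the resulting name could be checked against the basic S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit). `deploymentBucketName` is a hypothetical helper. It applies only to concrete values, since `cdk.Stack.of(self).account` can be an unresolved token at synth time when the stack has no explicit `env`.

```typescript
// Hypothetical helper mirroring the template string used for the deployment bucket.
// Only meaningful for concrete account/region strings, not unresolved CDK tokens.
function deploymentBucketName(account: string, region: string): string {
  const name = `polis-deployment-packages-${account}-${region}`;

  // Basic S3 bucket naming rules: length and allowed characters.
  const validLength = name.length >= 3 && name.length <= 63;
  const validChars = /^[a-z0-9][a-z0-9.-]*[a-z0-9]$/.test(name);

  if (!validLength || !validChars) {
    throw new Error(`Invalid S3 bucket name: ${name}`);
  }
  return name;
}

// Example with a placeholder 12-digit account ID.
const exampleBucketName = deploymentBucketName("123456789012", "us-east-1");
```

With a 12-digit account ID and a region such as `us-east-1`, the name stays well under the 63-character limit.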
37 changes: 37 additions & 0 deletions cdk/config/amazon-cloudwatch-agent.json
@@ -0,0 +1,37 @@
{
"agent": { "metrics_collection_interval": 60, "run_as_user": "root" },
"metrics": {
"append_dimensions": {
"AutoScalingGroupName": "${aws:AutoScalingGroupName}",
"ImageId": "${aws:ImageId}",
"InstanceId": "${aws:InstanceId}",
"InstanceType": "${aws:InstanceType}"
},
"metrics_collected": {
"nvidia_gpu": {
"measurement": [
{"name": "utilization_gpu", "unit": "Percent"},
{"name": "utilization_memory", "unit": "Percent"},
{"name": "memory_total", "unit": "Megabytes"},
{"name": "memory_used", "unit": "Megabytes"},
{"name": "memory_free", "unit": "Megabytes"},
{"name": "power_draw", "unit": "Watts"},
{"name": "temperature_gpu", "unit": "Count"}
],
"metrics_collection_interval": 60,
"nvidia_smi_path": "/usr/bin/nvidia-smi",
"metrics_aggregation_interval": 60,
"namespace": "OllamaMetrics"
},
"disk": {
"measurement": [ "used_percent" ],
"metrics_collection_interval": 60,
"resources": [ "/" ]
},
"mem": {
"measurement": [ "mem_used_percent" ],
"metrics_collection_interval": 60
}
}
}
}
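
Reviewer note on how this config relates to `cdk/autoscaling.ts`: the Ollama scaling policy tracks a metric named `utilization_gpu` in `ollamaNamespace`, keyed by the `AutoScalingGroupName` dimension, so the agent config needs to collect that measurement. The sketch below is one way to check this before deploying. `Measurement`, `gpuMeasurements`, and `missingMeasurements` are hypothetical names, the list is a hand-copied subset of the config above, and `fan_speed` is just a placeholder for a name that is not configured.

```typescript
// Hypothetical consistency check between metric names the scaling policy needs
// and the measurements listed in the agent config.
interface Measurement {
  name: string;
  unit: string;
}

// Hand-copied subset of the `nvidia_gpu.measurement` list from the config above.
const gpuMeasurements: Measurement[] = [
  { name: "utilization_gpu", unit: "Percent" },
  { name: "utilization_memory", unit: "Percent" },
  { name: "memory_used", unit: "Megabytes" },
];

// Returns the required metric names that the config does not collect.
function missingMeasurements(required: string[], configured: Measurement[]): string[] {
  return required.filter(
    (name) => !configured.some((measurement) => measurement.name === name)
  );
}

// `utilization_gpu` is configured; `fan_speed` stands in for a name that is not.
const missing = missingMeasurements(["utilization_gpu", "fan_speed"], gpuMeasurements);
```

A check like this covers only the configured names. The name and dimensions under which the agent finally publishes a metric may differ (for example through a plugin prefix or extra per-GPU dimensions), so the published metric is worth confirming in CloudWatch before relying on it for scaling.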