smarta

scrapedumper is a data-dump tool that is currently coupled to a MARTA client.

Its primary purpose is to consume MARTA realtime data and upload it to various storage targets (local directories, S3, DynamoDB).

Why is this needed?

This allows people to build a historical dataset of MARTA arrival times. SMARTA plans to use this to provide a dataset for MARTA train forecasting.

How does it work?

The dumper.Dumper interface provides a Dump function that accepts a reader (and presumably dumps data somewhere).

Implementing this interface provides an extensible way to dump data wherever it is needed.

Project Goals

  • Allow upload to local directories
  • Allow upload to S3
  • Allow upload to Dynamo
  • Allow multiclient response handling for Dynamo handler
  • Use a Scraper interface instead of coupling the MARTA client to it
  • Circuit breaker in the worker?
  • Backoff, jitter, and retry logic on the MARTA client

Running

After running go build to obtain the binary, you can run it as long as you provide the required options, either as command-line flags or through the corresponding environment variables:

type options struct {
	OutputLocation    string `long:"output-location" env:"OUTPUT_LOCATION" description:"local path to output"`
	DynamoTableName   string `long:"dynamo-table-name" env:"DYNAMO_TABLE_NAME" description:"dynamo table name"`
	S3BucketName      string `long:"s3-bucket-name" env:"S3_BUCKET_NAME" description:"s3 bucket to dump stuff into"`
	MartaAPIKey       string `long:"marta-api-key" env:"MARTA_API_KEY" description:"marta api key" required:"true"`
	PollTimeInSeconds int    `long:"poll-time-in-seconds" env:"POLL_TIME_IN_SECONDS" description:"time to poll marta api every second" required:"true"`

	ConfigPath *string `long:"config-path" env:"CONFIG_PATH" description:"An optional file that overrides the default configuration of sources and targets."`
}

./scrapedumper --output-location=. --marta-api-key={{key}} --poll-time-in-seconds=15

Config Based Approach

{
	"bus_dumper": {
		"kind": "ROUND_ROBIN | FILE | S3 | DYNAMODB",
		"components": [],
		"s3_bucket_name": "",
		"dynamo_table_name": "",
		"local_output_location": ""
	},
	"train_dumper": {
		"kind": "ROUND_ROBIN | FILE | S3 | DYNAMODB",
		"components": [],
		"s3_bucket_name": "",
		"dynamo_table_name": "",
		"local_output_location": ""
	}
}

./scrapedumper --config-path=./config --marta-api-key={{key}} --poll-time-in-seconds=15