First steps to lambda deploy #13
Conversation
I also added the …
This is because requests_cache uses SQLite on disk, which won't be possible in AWS Lambda. Ideally this will be reinstated for use when running locally.
The Makefile is mostly there to create requirements.txt. sam-template.yaml defines a CodeCommit repository and a scraper queue. The plan is to load scrapers into the queue with one Lambda function, then use the queue to trigger another Lambda per scraper to actually run it. Scraped data will then be committed to the CodeCommit repo. (A rough sketch of the queue-loading half follows.)
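A minimal sketch of that loading function, assuming boto3; the handler name, the SCRAPER_QUEUE_URL environment variable, and the hard-coded scraper list are all illustrative, not code from this PR:

```python
import os

import boto3


def queue_scrapers_handler(event, context):
    """Hypothetical first Lambda: push one SQS message per scraper."""
    sqs = boto3.client("sqs")
    queue_url = os.environ["SCRAPER_QUEUE_URL"]  # assumed to be wired up in sam-template.yaml
    for council_id in ["ABC", "DEF"]:  # placeholder for however scrapers are actually listed
        # Each message should trigger one worker Lambda invocation downstream.
        sqs.send_message(QueueUrl=queue_url, MessageBody=council_id)
```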
- class BaseCouncillorScraper(ScraperBase):
+ class BaseCouncillorScraper(CodeCommitMixin, ScraperBase):
This feels a bit brittle, as the order matters here. But I guess that's just multiple inheritance.
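The order matters because Python resolves attributes left to right along the MRO, so the mixin's methods override the base's. A toy illustration (the save method here is made up for the example):

```python
class ScraperBase:
    def save(self):
        print("saving to the local filesystem")


class CodeCommitMixin:
    def save(self):
        print("saving to CodeCommit")


class BaseCouncillorScraper(CodeCommitMixin, ScraperBase):
    pass


BaseCouncillorScraper().save()  # "saving to CodeCommit": the mixin wins; swap the bases and it won't
```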
- requests_cache.install_cache("scraper_cache", expire_after=60 * 60 * 24)
+ # import requests_cache
Commented out because it should probably be reinstated behind a check for the Lambda environment.
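A minimal sketch of such a check, assuming the AWS_LAMBDA_FUNCTION_NAME environment variable (which Lambda sets in its execution environment) is a good enough signal:

```python
import os

# Lambda sets AWS_LAMBDA_FUNCTION_NAME, so its absence suggests a local run
# where the sqlite-backed cache is safe to use.
if not os.environ.get("AWS_LAMBDA_FUNCTION_NAME"):
    import requests_cache

    requests_cache.install_cache("scraper_cache", expire_after=60 * 60 * 24)
```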
@@ -97,3 +100,193 @@ def save_raw(self, filename, content):
     def save_json(self, obj):
         file_name = "{}.json".format(obj.as_file_name())
         self._save_file("json", file_name, obj.as_json())
+
+
+class CodeCommitMixin:
Had a stab at pulling out the CodeCommit logic into a mixin for the scrapers. Not sure it's the best way of doing it, and I haven't made it clear which methods the child classes need to implement (one option is sketched below), but I thought I'd get a better handle on whether it's a workable system when I do the polling station scrapers.
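One conventional way to make those requirements explicit would be abstract methods on the mixin; the method names below are purely illustrative, not something in this PR:

```python
from abc import ABC, abstractmethod


class CodeCommitMixin(ABC):
    @abstractmethod
    def get_files_to_commit(self):
        """Child classes must say which scraped files should be committed."""

    def commit(self, message):
        files = self.get_files_to_commit()
        ...  # push `files` to the CodeCommit repo (details omitted)
```

A child class that forgets to implement get_files_to_commit then fails loudly at construction time rather than midway through a scrape.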
This seems ok to me. The only other pattern we could investigate is the way Django does pluggable storage in some places: define a storage interface that is subclassed, and then set that storage class in settings, globally, or by some other logic. So for example, we'd have:

    class Storage:
        def save(self, *args, **kwargs):
            raise NotImplementedError


    class LocalFileSystemStorage(Storage):
        pass


    class CodeCommitStorage(Storage):
        pass

And then BaseCouncillorScraper could assign self.storage = CodeCommitStorage() and later call self.storage.save() or whatever.

Happy to talk more about this pattern if you think it's useful. Some more reading:
Notes on workflow
Log in to AWS SSO via the CLI
aws sso login --profile dc-lgsf-dev
Build
sam build --template sam-template.yaml
Test a function locally
sam local invoke ScraperWorkerFunction --event lgsf/aws_lambda/fixtures/sqs-message.json --profile dc-lgsf-dev
NB: sqs-message.json was adapted from the output of sam local generate-event sqs receive-message
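For reference, the generated event wraps each message in a Records list, so a worker handler consuming that fixture would unpack it roughly like this (the handler body and the meaning of the message body are assumptions):

```python
def run_scraper(scraper_id):
    print(f"running scraper {scraper_id}")  # stand-in for the real scraper run


def handler(event, context):
    # SQS-triggered invocations deliver a batch of records; the body is
    # assumed to name the scraper to run.
    for record in event["Records"]:
        run_scraper(record["body"])
```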
Deploy to dev
sam deploy --profile dc-lgsf-dev
ToDo (turn into issues):
- self.options["aws_lambda"]
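If that option ends up gating behaviour, one speculative way it could plug into the storage pattern discussed above (CodeCommitStorage and LocalFileSystemStorage are the names from that sketch, not existing code):

```python
def get_storage(options):
    # Speculative: pick the storage backend from the aws_lambda option.
    if options.get("aws_lambda"):
        return CodeCommitStorage()
    return LocalFileSystemStorage()
```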