TheBlog Scanner should run periodically (via an OpenWhisk trigger) and scan theblog.adobe.com to determine whether new blog entries have been created. For each new blog entry detected, it invokes TheBlog Importer.
The execution flow looks like this:
- fetch the content of the theblog.adobe.com homepage
- compute the list of links on the page
- for each link, check whether it is present in the list of already processed URLs stored in a OneDrive XLSX file (/importer/urls.xlsx)
- if not present, invoke the helix-theblog-importer action
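The flow above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the scanner's actual code: `extractLinks`, `findNewLinks`, and `scan` are hypothetical names, the regex-based link extraction is a simplification, and the three callbacks passed to `scan` stand in for the real homepage fetch, the OneDrive XLSX lookup, and the OpenWhisk invocation of the importer.

```javascript
// Hypothetical helper: collect the absolute URLs of all <a href="..."> links
// found in an HTML string. A regex is enough for a sketch; real code would
// more likely use an HTML parser.
function extractLinks(html, baseUrl) {
  const links = new Set();
  const re = /<a\s[^>]*href\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = re.exec(html)) !== null) {
    try {
      // Resolve relative links against the homepage URL.
      links.add(new URL(match[1], baseUrl).href);
    } catch (e) {
      // Skip hrefs that cannot be resolved to a valid URL.
    }
  }
  return [...links];
}

// Hypothetical helper: keep only the links that are not in the list of
// already processed URLs (the rows of /importer/urls.xlsx).
function findNewLinks(links, processedUrls) {
  const seen = new Set(processedUrls);
  return links.filter((link) => !seen.has(link));
}

// Hypothetical orchestration. The callbacks are assumptions standing in for
// the homepage fetch, the XLSX read, and the importer invocation.
async function scan({ baseUrl, fetchHomepage, loadProcessedUrls, invokeImporter }) {
  const html = await fetchHomepage();
  const links = extractLinks(html, baseUrl);
  const processed = await loadProcessedUrls();
  const fresh = findNewLinks(links, processed);
  for (const url of fresh) {
    await invokeImporter(url);
  }
  return fresh;
}
```

Passing the I/O steps in as callbacks keeps the link comparison testable without network access or OneDrive credentials.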
Post entries published on theblog.adobe.com are sometimes corrupted and get fixed later. The scanner may have already detected the corrupted version and triggered its import. To re-trigger the import, remove the entry from the /importer/urls.xlsx file (delete the row): if the blog entry is still visible on the homepage, it will be re-imported. If not, you need to trigger the import manually: change the URL and run the test https://github.com/adobe/helix-theblog-importer/blob/master/test/index.test.js#L24.
Deploy the action:
npm run deploy
Create a trigger that fires every five minutes:
wsk trigger create five-mins-trigger --feed /whisk.system/alarms/alarm --param cron "*/5 * * * *"
Create a rule that links the trigger to the scanner action:
wsk rule update five-mins-scan five-mins-trigger helix-theblog/helix-theblog-scanner@latest
Connection to OneDrive:
AZURE_ONEDRIVE_CLIENT_ID
AZURE_ONEDRIVE_CLIENT_SECRET
AZURE_ONEDRIVE_REFRESH_TOKEN
OneDrive shared folder that contains the /importer/urls.xlsx file:
AZURE_ONEDRIVE_ADMIN_LINK
OpenWhisk credentials to invoke the helix-theblog-importer action:
OPENWHISK_API_KEY
OPENWHISK_API_HOST
Coralogix credentials to log:
CORALOGIX_API_KEY
CORALOGIX_LOG_LEVEL
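A quick way to catch an incomplete configuration is to check for missing settings before doing any work. The sketch below is illustrative only: `missingParams` is a hypothetical helper, and it assumes that the settings arrive as properties of a single `params` object and that all eight are required. Neither assumption is confirmed by this document.

```javascript
// The setting names listed above. Treating every one of them as required
// is an assumption made for this sketch.
const REQUIRED_PARAMS = [
  'AZURE_ONEDRIVE_CLIENT_ID',
  'AZURE_ONEDRIVE_CLIENT_SECRET',
  'AZURE_ONEDRIVE_REFRESH_TOKEN',
  'AZURE_ONEDRIVE_ADMIN_LINK',
  'OPENWHISK_API_KEY',
  'OPENWHISK_API_HOST',
  'CORALOGIX_API_KEY',
  'CORALOGIX_LOG_LEVEL',
];

// Hypothetical helper: return the names of the settings that are absent
// or empty in the given params object.
function missingParams(params) {
  return REQUIRED_PARAMS.filter(
    (name) => params[name] === undefined || params[name] === null || params[name] === '',
  );
}
```

An action could call `missingParams(params)` first and return an error that names the missing entries, instead of failing later inside the OneDrive or OpenWhisk calls.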
Deploying Helix Service requires the wsk command line client, authenticated to a namespace of your choice. For Project Helix, we use the helix namespace.
All commits to master that pass the tests are deployed automatically. All commits to branches that pass the tests are deployed as /helix-theblog/helix-theblog-scanner@ci<num> and tagged with the CI build number.