feat(drs-client): add S3 upload and SNS publish for brand presence analysis #1553

Merged
irenelagno merged 7 commits into main from feat/drs-client-s3-sns-brand-presence
Apr 30, 2026

@irenelagno irenelagno commented Apr 22, 2026

Summary

  • Adds uploadExcelToDrs() to DrsClient — uploads a brand presence Excel file directly to the DRS S3 bucket at external/spacecat/{siteId}/{brandSlug}/{jobId}/source.xlsx
  • Adds publishBrandPresenceAnalyze() to DrsClient — publishes a JOB_COMPLETED SNS event to DRS_SNS_TOPIC_ARN to trigger DRS Fargate analysis
  • Adds isS3Configured() guard method (checks DRS_S3_BUCKET + DRS_SNS_TOPIC_ARN)
  • Updates TypeScript declarations and adds @aws-sdk/client-s3 + @aws-sdk/client-sns dependencies
  • Adds implementation plan doc at docs/plans/2026-04-22-geo-brand-presence-drs-sns-migration.md
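
The guard in the third bullet can be pictured as a check that both settings are present before either AWS call is attempted. The following is a minimal sketch, assuming the guard simply tests both values for truthiness; the free function and the `env` argument are hypothetical stand-ins, not the `DrsClient` source.

```javascript
// Hypothetical stand-in for the isS3Configured() guard; not the DrsClient source.
function isS3Configured(env) {
  // Both values are needed: the bucket for the upload, the topic for the trigger.
  return Boolean(env.DRS_S3_BUCKET) && Boolean(env.DRS_SNS_TOPIC_ARN);
}

// Placeholder values, not real resources.
const exampleEnvs = [
  { DRS_S3_BUCKET: 'example-bucket' },
  { DRS_SNS_TOPIC_ARN: 'arn:aws:sns:us-east-1:000000000000:example-topic' },
  {
    DRS_S3_BUCKET: 'example-bucket',
    DRS_SNS_TOPIC_ARN: 'arn:aws:sns:us-east-1:000000000000:example-topic',
  },
];

for (const env of exampleEnvs) {
  console.log(Object.keys(env).join('+'), '->', isS3Configured(env));
}
```

Checking both values in one place lets a caller fail early instead of uploading a file that can never be announced over SNS.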

This is PR B of the geo-brand-presence DRS SNS migration. It unblocks the spacecat-audit-worker (PR D) to replace the existing HTTP call to POST /sites/{siteId}/brand-presence/analyze with a direct S3 + SNS pattern.

Cross-repo dependencies

| PR | Repo | Description |
|----|------|-------------|
| PR A | llmo-data-retrieval-service | Extends fargate_trigger.py; deletes HTTP endpoint |
| PR B (this) | spacecat-shared | Adds uploadExcelToDrs() and publishBrandPresenceAnalyze() to DrsClient |
| PR C | spacecat-infrastructure | Provisions DRS_S3_BUCKET, DRS_SNS_TOPIC_ARN env vars and IAM grants |
| PR D #2410 | spacecat-audit-worker | Replaces Mystique callback with direct DRS integration; depends on this PR and PR C |

Test plan

  • 71 tests passing, 100% line/statement/branch/function coverage
  • isS3Configured() — missing bucket, missing topicArn, both set
  • uploadExcelToDrs() — success (verifies bucket, key, ContentType, SSE), not configured, missing required fields, empty buffer, S3 error propagation
  • publishBrandPresenceAnalyze() — success (verifies TopicArn, full message shape, MessageAttributes), optional fields absent when not passed, not configured, missing required fields, SNS error propagation
  • createFrom() — reads DRS_S3_BUCKET and DRS_SNS_TOPIC_ARN from env
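
The upload cases listed above (bucket, key, ContentType, SSE, missing fields, empty buffer) can be illustrated with a small builder for the S3 PutObject input. This is only a sketch: it uses the key layout the PR settles on later in this conversation (no `brandSlug`), and `buildUploadInput`, the error messages, and the `AES256` encryption value are assumptions rather than details taken from the PR.

```javascript
// Standard MIME type for .xlsx files.
const XLSX_CONTENT_TYPE = 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet';

// Hypothetical helper: validates inputs and builds the PutObject parameters.
function buildUploadInput(bucket, siteId, jobId, excelBuffer) {
  if (!bucket) throw new Error('DRS S3 bucket is not configured');
  if (!siteId || !jobId) throw new Error('siteId and jobId are required');
  if (!excelBuffer || excelBuffer.length === 0) {
    throw new Error('excelBuffer must not be empty');
  }
  return {
    Bucket: bucket,
    Key: `external/spacecat/${siteId}/${jobId}/source.xlsx`,
    Body: excelBuffer,
    ContentType: XLSX_CONTENT_TYPE,
    ServerSideEncryption: 'AES256', // assumption: the PR does not say which SSE mode is used
  };
}

const input = buildUploadInput(
  'example-drs-bucket',
  'site-123',
  'spacecat-job-1',
  Buffer.from('fake xlsx bytes'),
);
console.log(input.Key);
```

Keeping validation and parameter construction in a pure function makes the "missing required fields" and "empty buffer" cases testable without mocking the AWS SDK.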

🤖 Generated with Claude Code

…alysis

Adds uploadExcelToDrs() and publishBrandPresenceAnalyze() to DrsClient to
support the geo-brand-presence DRS SNS migration (PR B). SpaceCat can now
upload Excel files directly to the DRS S3 bucket and trigger Fargate analysis
via SNS, replacing the previous HTTP endpoint call to DRS.

New env vars required (provisioned via infrastructure PR C):
- DRS_S3_BUCKET
- DRS_SNS_TOPIC_ARN

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@irenelagno

Cross-repo: geo-brand-presence DRS SNS migration

This PR is PR B of a four-repo migration. Related PRs:

| PR | Repo | Description |
|----|------|-------------|
| A | llmo-data-retrieval-service #1397 | DRS: synthetic job support + bucket policy granting SpaceCat cross-account write |
| B (this) | spacecat-shared | spacecat-shared-drs-client v1.5.0: uploadExcelToDrs + publishBrandPresenceAnalyze |
| C | spacecat-infrastructure #475 | Lambda role IAM permissions (S3 write + SNS publish) |
| D | spacecat-audit-worker #2410 | Refresh handler changes; depends on B published + C deployed |

Rollout order: A, B, C can merge in any order. D cannot merge until B is published as an npm package and C is deployed to staging.

@github-actions

This PR will trigger a minor release when merged.

irenelagno and others added 3 commits April 27, 2026 14:49
…bId return

- uploadExcelToDrs: change from destructured object to positional args
  (siteId, jobId, excelBuffer); key uses jobId directly, no brandSlug
- publishBrandPresenceAnalyze: change to (siteId, params) signature; generate
  jobId internally via randomUUID and return it; add web_search_provider,
  config_version, run_frequency to SNS metadata; move week/year to top-level
  SNS message to match DRS JobCompletedNotification.from_dict; rename
  metadata.site → metadata.site_id
- Remove UploadExcelParams interface; update PublishBrandPresenceParams and
  TypeScript declarations to match new signatures
- Update all tests to match new call patterns and SNS metadata assertions
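
The message layout described in this commit (week/year at the top level, `site_id` and the run settings under `metadata`) might look roughly like the sketch below. The field names `job_id`, `provider_id`, `result_location`, `site_id`, `web_search_provider`, `config_version`, and `run_frequency` appear in this conversation; the `event_type` key, the `compact` helper, and the function name are assumptions made for illustration.

```javascript
// Hypothetical helper: drops keys whose value is undefined so optional
// fields are absent from the message when the caller does not pass them.
function compact(obj) {
  return Object.fromEntries(Object.entries(obj).filter(([, value]) => value !== undefined));
}

// Sketch of the SNS message body; not the DrsClient source.
function buildJobCompletedMessage(siteId, params) {
  const {
    jobId, resultLocation, webSearchProvider, configVersion, week, year, runFrequency,
  } = params;
  return compact({
    event_type: 'JOB_COMPLETED', // assumption: the exact key name is not shown in the PR
    job_id: jobId,
    provider_id: 'external_spacecat',
    result_location: resultLocation,
    week, // top-level, per the commit message
    year,
    metadata: compact({
      site_id: siteId, // renamed from metadata.site
      web_search_provider: webSearchProvider,
      config_version: configVersion,
      run_frequency: runFrequency,
    }),
  });
}

const message = buildJobCompletedMessage('site-123', {
  jobId: 'spacecat-job-1',
  resultLocation: 's3://example-drs-bucket/external/spacecat/site-123/spacecat-job-1/source.xlsx',
  week: 17,
  year: 2026,
  runFrequency: 'weekly',
});
console.log(JSON.stringify(message, null, 2));
```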

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@irenelagno irenelagno left a comment

Bug: publishBrandPresenceAnalyze generates a new jobId internally, breaking the S3 key ↔ SNS job_id contract required by DRS

uploadExcelToDrs(siteId, jobId, buffer) builds the S3 key as external/spacecat/{siteId}/{jobId}/source.xlsx.
publishBrandPresenceAnalyze then generates its own fresh spacecat-{uuid} as job_id — so the SNS job_id never matches the S3 key the caller used for upload.

DRS PR #1484 (Fix #3) derives result_location from the SNS job_id: s3://{bucket}/external/spacecat/{siteId}/{job_id}/source.xlsx. With the current design the derived path doesn't exist — the file is at the uploadExcelToDrs-time jobId, not the SNS-time jobId.

Fix: accept jobId as an optional parameter, defaulting to a generated value only when not provided:

async publishBrandPresenceAnalyze(siteId, {
  jobId = `spacecat-${randomUUID()}`,
  resultLocation,
  webSearchProvider,
  configVersion,
  week,
  year,
  runFrequency,
  brand,
  imsOrgId,
} = {}) {

The caller (spacecat-audit-worker PR #2410) then generates jobId once per sheet, passes it to both uploadExcelToDrs and publishBrandPresenceAnalyze, ensuring S3 key == SNS job_id end-to-end.
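
The contract described in this review comment can be demonstrated with a stub in place of `DrsClient`. In this sketch the stub methods are synchronous and only return strings, `fakeUuid` is a deterministic stand-in for `randomUUID()`, and `runSheet` is a hypothetical caller; none of these names come from the PR.

```javascript
// Deterministic stand-in for randomUUID(), so the output is reproducible.
let counter = 0;
const fakeUuid = () => `fake-${++counter}`;

// Stub standing in for DrsClient; it computes paths instead of touching AWS.
const stubClient = {
  bucket: 'example-drs-bucket',
  uploadExcelToDrs(siteId, jobId) {
    return `s3://${this.bucket}/external/spacecat/${siteId}/${jobId}/source.xlsx`;
  },
  publishBrandPresenceAnalyze(siteId, { jobId = `spacecat-${fakeUuid()}` } = {}) {
    // DRS derives the object path it will read from the job_id in the message.
    return {
      jobId,
      derivedLocation: `s3://${this.bucket}/external/spacecat/${siteId}/${jobId}/source.xlsx`,
    };
  },
};

// Hypothetical caller: one jobId per sheet, optionally shared with the publish call.
function runSheet(siteId, { shareJobId }) {
  const jobId = `spacecat-${fakeUuid()}`;
  const uploadedTo = stubClient.uploadExcelToDrs(siteId, jobId);
  const published = stubClient.publishBrandPresenceAnalyze(siteId, shareJobId ? { jobId } : {});
  return uploadedTo === published.derivedLocation;
}

console.log(runSheet('site-123', { shareJobId: false })); // false: DRS would look in the wrong place
console.log(runSheet('site-123', { shareJobId: true })); // true: S3 key matches the SNS job_id
```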

Comment thread packages/spacecat-shared-drs-client/src/index.js Outdated
irenelagno and others added 2 commits April 29, 2026 13:14
Fixes the S3 key ↔ SNS job_id contract break: when the caller supplies
a jobId (i.e. the same one passed to uploadExcelToDrs) it is used in the
SNS message directly. A fresh spacecat-{uuid} is generated only when no
jobId is provided.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@irenelagno irenelagno merged commit 56b1266 into main Apr 30, 2026
5 checks passed
@irenelagno irenelagno deleted the feat/drs-client-s3-sns-brand-presence branch April 30, 2026 16:53
solaris007 pushed a commit that referenced this pull request Apr 30, 2026
# [@adobe/spacecat-shared-drs-client-v1.6.0](https://github.com/adobe/spacecat-shared/compare/@adobe/spacecat-shared-drs-client-v1.5.0...@adobe/spacecat-shared-drs-client-v1.6.0) (2026-04-30)

### Features

* **drs-client:** add S3 upload and SNS publish for brand presence analysis ([#1553](#1553)) ([56b1266](56b1266))
@solaris007
Member

🎉 This PR is included in version @adobe/spacecat-shared-drs-client-v1.6.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

irenelagno added a commit to adobe/spacecat-audit-worker that referenced this pull request Apr 30, 2026
…2410)

## Summary

Migrates the geo-brand-presence refresh flow from deprecated
Mystique/SQS to direct DRS triggering via S3 cross-account upload + SNS
publish (`provider_id="external_spacecat"`). Removes all deprecated
cadence/detect/categorization handlers.

### Changes

**Deleted (8 files)**
- `src/geo-brand-presence/{handler,detect-geo-brand-presence-handler,categorization-response-handler}.js`: deprecated Mystique-based handlers
- `src/geo-brand-presence-daily/{handler,detect-geo-brand-presence-handler}.js`: deprecated daily Mystique handlers
- Corresponding test files

**Weekly refresh handler**
(`src/geo-brand-presence/geo-brand-presence-refresh-handler.js`)
- Removed Mystique SQS fallback entirely
- DRS not configured → `internalServerError` (hard fail instead of
fallback)
- Replaced `triggerBrandPresenceAnalyze` HTTP call with
`uploadExcelToDrs` (cross-account S3 PutObject) +
`publishBrandPresenceAnalyze` (SNS publish) with `runFrequency:
'weekly'`

**New: Daily refresh handler**
(`src/geo-brand-presence-daily/geo-brand-presence-refresh-handler.js`)
- Same logic as weekly, with `runFrequency: 'daily'`
- Registered as `refresh:geo-brand-presence-daily` in `src/index.js`
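
Since the weekly and daily handlers are described as sharing the same logic and differing only in `runFrequency`, one way to picture them is a small factory. This is a sketch under assumptions: `createRefreshHandler`, the argument shape, the plain `{ status, body }` return values, and the stub client are all hypothetical, and the real handlers read the Excel file from SharePoint and return SpaceCat response helpers such as `internalServerError`.

```javascript
// Hypothetical factory: the same flow parameterized by run frequency.
function createRefreshHandler(runFrequency) {
  return async function refresh({ drsClient, siteId, jobId, excelBuffer }) {
    // Hard fail when DRS is not configured; there is no Mystique fallback.
    if (!drsClient.isS3Configured()) {
      return { status: 500, body: 'DRS S3/SNS is not configured' };
    }
    // The same jobId goes to both calls so the S3 key matches the SNS job_id.
    await drsClient.uploadExcelToDrs(siteId, jobId, excelBuffer);
    await drsClient.publishBrandPresenceAnalyze(siteId, { jobId, runFrequency });
    return { status: 200, body: jobId };
  };
}

// Stub client that records calls instead of talking to AWS.
function createStubClient(configured) {
  const calls = [];
  return {
    calls,
    isS3Configured: () => configured,
    uploadExcelToDrs: async (siteId, jobId) => { calls.push(['upload', siteId, jobId]); },
    publishBrandPresenceAnalyze: async (siteId, params) => {
      calls.push(['publish', siteId, params]);
      return params.jobId;
    },
  };
}

const weekly = createRefreshHandler('weekly');
const daily = createRefreshHandler('daily');

const stub = createStubClient(true);
daily({
  drsClient: stub, siteId: 'site-123', jobId: 'spacecat-job-1', excelBuffer: Buffer.from('x'),
}).then((res) => console.log(res.status, stub.calls.map((call) => call[0])));
```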

**`src/llmo-customer-analysis/handler.js`**
- Restored `triggerGeoBrandPresenceRefresh()` (removed in
d6487d8/211c0007)
- `hasBrandPresenceChanges` (topics/entities/categories) →
`drsClient.triggerBrandDetection(siteId)`
- `needsBrandPresenceRefresh` (brands/competitors +
`previousConfigVersion`) → SQS `geo-brand-presence-trigger-refresh`
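
The two routing rules above can be read as a small decision function. The sketch below is one interpretation, not the handler's code: it assumes the refresh path requires both a brands/competitors change and an existing `previousConfigVersion`, and `planBrandPresenceActions` and the `changedKeys` input are hypothetical names.

```javascript
const DETECTION_KEYS = ['topics', 'entities', 'categories'];
const REFRESH_KEYS = ['brands', 'competitors'];

// Hypothetical planner: returns the actions a config change would trigger.
function planBrandPresenceActions(changedKeys, previousConfigVersion) {
  const actions = [];
  // hasBrandPresenceChanges: topics/entities/categories changed.
  if (changedKeys.some((key) => DETECTION_KEYS.includes(key))) {
    actions.push('drsClient.triggerBrandDetection');
  }
  // needsBrandPresenceRefresh: brands/competitors changed and a previous
  // config version exists (assumed reading of the bullet above).
  if (previousConfigVersion && changedKeys.some((key) => REFRESH_KEYS.includes(key))) {
    actions.push('sqs:geo-brand-presence-trigger-refresh');
  }
  return actions;
}

console.log(planBrandPresenceActions(['topics'], 'v1'));
console.log(planBrandPresenceActions(['brands', 'categories'], 'v1'));
console.log(planBrandPresenceActions(['competitors'], undefined));
```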

**`src/index.js`**
- Removed 27+ deprecated handler entries
- Added `refresh:geo-brand-presence-daily`

**Cross-repo plan**:
`docs/plans/2026-04-22-geo-brand-presence-drs-sns-migration.md`

## Integration Pattern

```
SpaceCat refresh handler
  → read Excel from SharePoint
  → drsClient.uploadExcelToDrs()  →  s3://drs-bucket/external/spacecat/{siteId}/{jobId}/source.xlsx
  → drsClient.publishBrandPresenceAnalyze()  →  SNS { provider_id: "external_spacecat", result_location: s3://... }
  → DRS fargate_trigger (existing external_spacecat filter)
      → job not in DynamoDB → create synthetic job → launch Fargate
  → Fargate reads s3:// (ExternalPromptLoader, no changes needed)
  → brand analysis → distribution
```

## Dependencies (must land first)

| Repo | PR | Description |
|------|----|-------------|
| `adobe/spacecat-shared` | [#1553](adobe/spacecat-shared#1553) | `spacecat-shared-drs-client` v1.5.0: `uploadExcelToDrs` + `publishBrandPresenceAnalyze`. **Must be published as npm package before this PR merges** |
| `adobe/spacecat-infrastructure` | [#475](adobe/spacecat-infrastructure#475) | Lambda role: `s3:PutObject` on DRS bucket + `sns:Publish` on DRS topic. **Must be deployed to dev before end-to-end test** |
| `adobe-rnd/llmo-data-retrieval-service` | [#1397](adobe-rnd/llmo-data-retrieval-service#1397) | Fargate trigger: synthetic job creation for `external_spacecat`; SNS topic cross-account publish policy |

## Test plan

### Tier 1 — Unit tests (no AWS, no network)
- [x] `npm test` passes with 100% coverage
- [x] `src/geo-brand-presence` — 100% coverage
- [x] `src/geo-brand-presence-daily` — 100% coverage
- [x] `src/llmo-customer-analysis/handler.js` — 100% coverage; 50 tests
pass

`DrsClient.uploadExcelToDrs` and `publishBrandPresenceAnalyze` are
mocked — validates handler logic and all error paths without AWS.

### Tier 2 — Integration smoke test against DRS eph stack

Test the handler end-to-end against the live DRS eph-1397 stack **before
merging**, without needing full SpaceCat infra. Requires AWS credentials
with access to the DRS dev account and `spacecat-shared-drs-client`
v1.5.0 installed locally.

**Set env vars pointing at the DRS eph stack:**
```bash
export DRS_S3_BUCKET=drs-v2-eph-1397-bp
export DRS_SNS_TOPIC_ARN=arn:aws:sns:us-east-1:489975610310:drs-v2-eph-1397-job-notifications
export AWS_PROFILE=drs-dev   # profile with access to DRS dev account
```

**Invoke the weekly handler directly with a test harness** (do not
commit):
```js
// test-harness.mjs
import { handler } from './src/geo-brand-presence/geo-brand-presence-refresh-handler.js';

await handler({
  auditContext: { siteId: '<real-spacecat-site-uuid>' },
  site: {
    getId: () => '<real-spacecat-site-uuid>',
    getOrganization: () => ({ getImsOrgId: () => '<ims-org-id>' }),
  },
});
```

**Verify in DRS:**
```bash
# DynamoDB — synthetic job must appear with provider_id=external_spacecat
aws --profile drs-dev dynamodb get-item \
  --table-name drs-v2-eph-1397-jobs \
  --key '{"job_id":{"S":"<job-id-from-handler-log>"}}' \
  --region us-east-1

# CloudWatch — Fargate task must launch
aws --profile drs-dev logs tail \
  /ecs/drs-v2-eph-1397-bp-fargate --follow --region us-east-1
```

Expected: Excel appears in S3 → SNS published → DRS synthetic job
created with `platform=chatgpt_paid` → Fargate launches brand analysis.

### Tier 3 — End-to-end staging validation (post-deploy)
- [ ] `DRS_S3_BUCKET` + `DRS_SNS_TOPIC_ARN` set in Secrets Manager
`/helix-deploy/spacecat-services/*/latest`
- [ ] Deploy to staging with PR A (DRS) + PR C (infra) already deployed
- [ ] Trigger `geo-brand-presence-trigger-refresh` for a real site →
verify: Excel in DRS S3 → SNS published → Fargate launched →
distribution completed

## Post-deployment steps

After merging and deploying to prod, the following must be configured in
AWS Secrets Manager under `/helix-deploy/spacecat-services/all`:

| Secret key | Description | Where to get the value |
|---|---|---|
| `DRS_S3_BUCKET` | DRS brand-presence S3 bucket name | DRS infra CDK stack output (prod equivalent of dev's `drs-v2-main-bp`) |
| `DRS_SNS_TOPIC_ARN` | DRS notification SNS topic ARN | DRS infra CDK stack output (e.g. `arn:aws:sns:us-east-1:<prod-account>:drs-v2-prod-job-notifications`) |

> `DRS_API_URL` and `DRS_API_KEY` should already be present from the
existing weekly handler.

Also ensure the SpaceCat Lambda execution role in prod has the IAM
permissions from [spacecat-infrastructure PR
#475](adobe/spacecat-infrastructure#475):
- `s3:PutObject` on `arn:aws:s3:::${DRS_S3_BUCKET}/external/spacecat/*`
- `sns:Publish` on `${DRS_SNS_TOPIC_ARN}`
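
For orientation, the two permissions listed above correspond to an identity policy along the following lines. This is a sketch only: `buildDrsAccessPolicy` is a hypothetical helper, the bucket and topic values are placeholders, and the actual statements live in spacecat-infrastructure PR #475, which is not reproduced here.

```javascript
// Hypothetical helper that renders the two grants as an IAM policy document.
function buildDrsAccessPolicy(bucket, topicArn) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Action: 's3:PutObject',
        // Write access is limited to the SpaceCat prefix in the DRS bucket.
        Resource: `arn:aws:s3:::${bucket}/external/spacecat/*`,
      },
      {
        Effect: 'Allow',
        Action: 'sns:Publish',
        Resource: topicArn,
      },
    ],
  };
}

const policy = buildDrsAccessPolicy(
  'example-drs-bucket',
  'arn:aws:sns:us-east-1:000000000000:example-job-notifications',
);
console.log(JSON.stringify(policy, null, 2));
```

Because the bucket and topic belong to the DRS account, the DRS-side bucket and topic policies from PR A also have to allow the SpaceCat role; the identity policy alone is not sufficient for cross-account access.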

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>