Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Monthly activity email cron job #4400

Merged
merged 7 commits into from
Apr 29, 2024
Merged

Monthly activity email cron job #4400

merged 7 commits into from
Apr 29, 2024

Conversation

Vinnl
Copy link
Collaborator

@Vinnl Vinnl commented Apr 8, 2024

References:

Jira: MNTOR-3067
Figma: N/A

Description

This adds a MonthlyActivityEmail flag and a script that will send the monthly activity email to 10 (or $MONTHLY_ACTIVITY_EMAIL_BATCH_SIZE) Plus subscribers who have that flag enabled, whose monthly_email_optout is not true (@mansaj - is this column actually boolean, or is this the one where I need to check for 0 or 1 or -1 or something) and to whom we've not sent such an email in the last 30 days.

Importantly, it also sets up the build infrastructure to compile JSX and TypeScript for cron job scripts (though note that I haven't updated our cron jobs configuration yet to call the script - which will be at dist/cronjobs/monthlyActivity.js). The main thing here to realise is that this means that some code is now built by Next.js (using Webpack and SWC) when shipped as part of the website, and by ESBuild when shipped as part of the cron job - and that these have their own peculiarities. For example, JSX files that are also used in cron jobs need to explicitly import React, need to use a wrapper around import "server-only", need to load Fluent differently (see also my followup PR), and don't have access to variables from .env (possibly we should run dotenv for cron jobs like we do for the e2e tests).

I'll call out some more specific things inline.

How to test

Enable the MonthlyActivityEmail flag for your user, then run npm run dev:cronjobs. You may want to reset the value of their monthly_email_at column afterwards if you want to retry.

Checklist (Definition of Done)

  • Localization strings (if needed) have been added.
  • Commits in this PR are minimal and have descriptive commit messages.
  • I've added or updated the relevant sections in readme and/or code comments
  • I've added a unit test to test for potential regressions of this bug. - Somewhat, but there'd also be lots of mocking involved for more.
  • Product Owner accepted the User Story (demo of functionality completed) or waived the privilege.
  • All acceptance criteria are met. - Still need to actually have the cron job executed.
  • Jira ticket has been updated (if needed) to match changes made during the development process.
  • Jira ticket has been updated (if needed) with suggestions for QA when this PR is deployed to stage.

@Vinnl Vinnl requested review from rhelmer and mansaj April 8, 2024 19:55
@Vinnl Vinnl requested a review from flodolo as a code owner April 8, 2024 19:55
Copy link
Collaborator Author

@Vinnl Vinnl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Called out some things to pay attention to.

format: "esm",
outdir: "dist/cronjobs",
sourcemap: true,
target: "node20.12",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhelmer Here's another instance of the Node version we'd need to keep up-to-date.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Should we start some document/wiki/README in this repo with all the places we need to bump Node.js versions (considering last week's sec releases and this week's scheduled sequel)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can be just node or node20 if you want to be more specific - do you think it's worth specifying it here? The environment running the cron job (k8s CronJob using the container we build in Dockerfile) will be what's actually available in the environment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I like about saying 20.12 is that it will show up on a project-wide search for that number, so that we won't forget to update it when updating the other instances. Using just 20 wouldn't be a problem for the coming future, but makes it easy to miss when we upgrade to Node 22.

As for the document/wiki/README, I'd actually like to write a small script that just checks all instances and errors out if they're not aligned, that we can run in CI.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to move this into a separate file (de1c3d1) because the other functions in the module referenced the Next.js-specific l10n module, which calls require.context, which isn't available in plain Node.

* @param params.countryCode
*/
export async function getSubscriberBreaches({
fxaUid,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to fxaUid from user because that was the only property we used, and user was the user's session - which there is none of when running cronjobs.

Other than that I did not change this function.

@@ -102,7 +102,7 @@ export async function triggerMonthlyActivity(emailAddress: string) {
const data = getDashboardSummary(
latestScan.results,
await getSubscriberBreaches({
user: session.user,
fxaUid: session.user.subscriber?.fxa_uid,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment at the function definition of getSubscriberBreaches for why I changed this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is needed because, while import "server-only" prevents us from loading a module in a client component, it also breaks when running in plain Node.

process.env.MONTHLY_ACTIVITY_EMAIL_BATCH_SIZE ?? "",
10,
);
const batchSize = Number.isNaN(batchSizeEnvVar) ? 10 : batchSizeEnvVar;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured I'd start with (configurable) small batches so that we can just run the cron job more often and not overload the server with a single big one, since we're doing multiple queries per user. Since this is limited to Plus users for now, there's not that many accounts we need to get through anyway. Also makes it easier to recover from errors.

fxaUid: subscriber.fxa_uid,
countryCode: countryCodeGuess,
});
const data = getDashboardSummary(latestScan.results, subscriberBreaches);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have to check with Tony whether we want to show totals, or only specifics for the past month.

Comment on lines 309 to 308
query = query.andWhereRaw(
`EXISTS (
SELECT 1
FROM jsonb_array_elements(fxa_profile_json->'subscriptions') AS subscription
WHERE subscription::text = '"${MONITOR_PREMIUM_CAPABILITY}"'
)`
);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know pretty little of SQL, so I'm not sure if this is the best way to check for rows where the fxa_profile_json column has an object with a subscriptions property containing an array of strings, of which one element is "monitor"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it's any more efficient but there is a ? operator https://www.postgresql.org/docs/9.4/functions-json.html that can check for existence:

WHERE (fxa_profile_json->'subscriptions')::jsonb ? 'monitor'

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think I did try to use that, but it got interpreted as a parameter - and then my PG UI didn't support it either. But you just gave me the push to verify that running it in psql does work, and then digging through old Knex.js issues to find out how to escape it: ef2a47c

Copy link
Collaborator Author

@Vinnl Vinnl Apr 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably best to ignore this file for now - it's basically the existing Next.js getL10n, but with changes to make it compatible with plain Node. I've got a followup PR that untangles the five different runtimes that l10n currently has to take into account, and abstracts away the common parts.

@@ -2,6 +2,7 @@
* License, v. 2.0. If a copy of the MPL was not distributed with this
* file, You can obtain one at http://mozilla.org/MPL/2.0/. */

import React from "react";
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next.js auto-imports React in JSX files, but esbuild doesn't (at least no by default, maybe there's an option somewhere).

@Vinnl Vinnl mentioned this pull request Apr 8, 2024
@Vinnl Vinnl self-assigned this Apr 8, 2024
@Vinnl Vinnl added the Review: M Code review time: 1-2 hours label Apr 8, 2024
esbuild.cronjobs.mjs Outdated Show resolved Hide resolved
format: "esm",
outdir: "dist/cronjobs",
sourcemap: true,
target: "node20.12",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Should we start some document/wiki/README in this repo with all the places we need to bump Node.js versions (considering last week's sec releases and this week's scheduled sequel)?


async function run() {
const batchSizeEnvVar = Number.parseInt(
process.env.MONTHLY_ACTIVITY_EMAIL_BATCH_SIZE ?? "",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Why nullish coalesce to empty string which will NaN and then default to 10 below on L28?
Could we default here to "10" and skip L28, or am I missing something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so I was changing it to your suggestion, but then remembered why I didn't do that in the first place - I did still want to guard against passing in an invalid value for the env var, i.e. check Number.isNaN. But that does mean we'd ignore the env var value silently, so I'm now just throwing if it's wrong: 32e5b54.

src/db/tables/subscribers.js Show resolved Hide resolved
Copy link
Collaborator

@rhelmer rhelmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, I have some quibbles that are pretty minor. I know it's safe as-is but I really think we should always use parameterized queries for knex queries.

format: "esm",
outdir: "dist/cronjobs",
sourcemap: true,
target: "node20.12",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can be just node or node20 if you want to be more specific - do you think it's worth specifying it here? The environment running the cron job (k8s CronJob using the container we build in Dockerfile) will be what's actually available in the environment.

src/cronjobs/monthlyActivity.tsx Outdated Show resolved Hide resolved
@@ -8,6 +8,7 @@ import AppConstants from '../../appConstants.js'

const knex = createDbConnection();
const { DELETE_UNVERIFIED_SUBSCRIBERS_TIMER } = AppConstants
const MONITOR_PREMIUM_CAPABILITY = "monitor";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be in constants.ts, since it's used in more than one place?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, I was actually going to ask you, since I copied this pattern from you 😅

Comment on lines 306 to 308
// Note: This will only match people who had a Plus subscription the last
// time they signed in. We'd have to check with SubPlat directly to
// find out if they have a subscription right now.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Monitor database is updated by webhook as well, so we don't necessarily need to wait until the user logs in again.

This comment is relevant for dev environments that don't have webhook handling set up e.g. localhost.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, good one - clarified the comment in 9c81d47.

Comment on lines 309 to 308
query = query.andWhereRaw(
`EXISTS (
SELECT 1
FROM jsonb_array_elements(fxa_profile_json->'subscriptions') AS subscription
WHERE subscription::text = '"${MONITOR_PREMIUM_CAPABILITY}"'
)`
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it's any more efficient but there is a ? operator https://www.postgresql.org/docs/9.4/functions-json.html that can check for existence:

WHERE (fxa_profile_json->'subscriptions')::jsonb ? 'monitor'

src/db/tables/subscribers.js Outdated Show resolved Hide resolved
let query = knex('subscribers')
.select()
.where((builder) => builder.whereNull("monthly_email_optout").orWhere("monthly_email_optout", false))
.andWhere(builder => builder.whereNull("monthly_email_at").orWhereRaw('"monthly_email_at" < NOW() - INTERVAL \'30 days\''));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this avoids re-sending because updateMonthlyEmailTimestamp() is called which sets this, right? I just want to make sure we're thinking through that path if there are any exceptions, so we can't e.g. send duplicate emails because something there fails after the email is sent but before this timestamp is updated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly, and that gets set before we send the email, so if an error occurs, the email won't get sent. Although given #4399 (comment), I'll probably need to create a new query once that lands.

Copy link
Collaborator

@mansaj mansaj Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking through this more, what if we:

  1. drop the columns (and data) of monthly_email_optout and monthly_email_at
  2. add new cols monthly_monitor_report: boolean and monthly_monitor_report_at:timestamp

Basically do a reset since we don't want to keep the old data

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I believe the data that exists should be inconsequential (very outdated, at least 6-12 months old)

Copy link
Collaborator Author

@Vinnl Vinnl Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that works for me. Wouldn't really change the logic here, just different column names, and a new query to update the timestamp when I send an email. (Although I imagine we do want to keep existing opt-outs of monthly_email_optout.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, rebased on latest main that has the changes described above, and added e34b2e5 to make use of them.

@Vinnl Vinnl mentioned this pull request Apr 9, 2024
@@ -0,0 +1,108 @@
/* This Source Code Form is subject to the terms of the Mozilla Public
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use src/scripts/cronjobs

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With apologies for rebasing, but I was afraid I wasn't going to be able to get all the path changes correct if I spread this over fixup commits. Fixed in f753a0f and 93c298a.

// show up as errors when we remove it from the flag list:
/** @type {import("./featureFlags.js").FeatureFlagName} */
const featureFlagName = "MonthlyActivityEmail";
const flag = await knex("feature_flags")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we use a feature flag util function for this? If this specific one doesn't exist, maybe we should create it in the util file?

Also, for example, we need to make sure delete_at is null, since we don't actually delete old feature flags but rather archive them

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's a good point, added that check in a util function in 1c425f7.

The main reason I initially shied away from the util function, is because I had to know both is_enabled and the allow_list, and thus was already replicating feature flag-specific logic. I've now created a function that does a check for the deleted_at, but otherwise just returns the full row. We'll probably replace it by experiments later on anyway.

updated_at: knex.fn.now(),
})
.where("primary_email", subscriber.primary_email)
.andWhere("id", subscriber.id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it necessary to match both primary_email and id? Is there a case where these two won't match?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is me being overly cautious after our security issues where people were able to have us email other people by adding their email address - by matching on ID in addition to email address, if we're somehow looking at someone else with the same email address, it won't match. But matching on just the ID should be sufficient, of course. There's no practical downside to matching on both though, is that correct?

/**
* @param {string} email
* @deprecated Only used by the `send-email-to-unresolved-breach-subscribers.js`, which it looks like might not be sent anymore?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @deprecated Only used by the `send-email-to-unresolved-breach-subscribers.js`, which it looks like might not be sent anymore?
* @deprecated Delete as a part of MNTOR-3077

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -296,6 +349,7 @@ async function updateMonthlyEmailTimestamp (email) {
* Unsubscribe user from monthly unresolved breach emails
*
* @param {string} token User verification token
* @deprecated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @deprecated
* @deprecated Delete as a part of MNTOR-3077

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 43 to 48
// const availableLocales = Object.keys(bundleSources);
// const filteredLocales =
// process.env.APP_ENV === "heroku"
// ? availableLocales
// : availableLocales.filter((locale) => supportedLocales?.includes(locale));
// const currentLocales = negotiateLanguages(languages, filteredLocales, {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// const availableLocales = Object.keys(bundleSources);
// const filteredLocales =
// process.env.APP_ENV === "heroku"
// ? availableLocales
// : availableLocales.filter((locale) => supportedLocales?.includes(locale));
// const currentLocales = negotiateLanguages(languages, filteredLocales, {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the whole file in #4401, but might as well fix it for a cleaner history :) 49c09b4

That's the only property that's actually used, and this allows
callers to also obtain it from elsewhere. This will especially be
relevant when sending emails, when there's no user session, and
thus the UID will have to be fetched from the database.
This will allow us to import that function without also importing
the other modules imported in that file. This will be extra
relevant for cron jobs, which don't run Webpack and thus will trip
over `require.context` being present in l10n modules that get
transitively included for one of the other functions.
I'll do a pass over the three different l10n loading strategies we
have right now and extract the common bits, to hopefully make it a
bit more maintainable. Gotta love five different runtimes!

(Yes, that's not an exaggeration: plain Node (for emails), server
components, client components, Storybook, and Jest.)
This this is running in plain Node (i.e. not with the Next.js
infrastructure), I had to make a couple of adjustments to some
existing files. Specifically, files with JSX code need to explicitly
import React, and code that shouldn't be run on the client-side
can't use `import "server-only"`, because that only checks that
it's being imported in a React server component.

Additionally, I've added esbuild to bundle it up into a single
file. Theoretically, this isn't really needed for Node.js scripts,
since there are no HTTP requests we're trying to optimise. However,
plain Node.js can't resolve module imports without file extensions
(e.g. `"../db/tables/subscribers"`), so all import specifiers in
*any* module that gets loaded by the cron jobs would need to use
file extensions that match those of the generated files (i.e. .js
rather than .ts), which is bound to cause its own issues in
Next.js. Thus, esbuild is a compromise that can resolve these
import specifiers for us.
This uses `tsx` ("TS execute") to strip the type annotations on the
fly and then run the resulting output in Node. It uses ESBuild
behind the scenes, so the results should be consistent with the
built JS in production.
@Vinnl Vinnl merged commit 56c50e9 into main Apr 29, 2024
14 checks passed
@Vinnl Vinnl deleted the MNTOR-3067-monthly-cron branch April 29, 2024 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Review: M Code review time: 1-2 hours
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants