
Add example for AWS S3 #123

Open
better-salmon opened this issue Nov 11, 2023 · 9 comments
Labels: documentation (Improvements or additions to documentation), example (Examples that enrich the documentation), help wanted (Extra attention is needed)

Comments

@better-salmon
Contributor

No description provided.

better-salmon added the documentation and example labels on Nov 11, 2023
better-salmon added the good first issue and help wanted labels, then removed the good first issue label, on Dec 10, 2023
@MauriceArikoglu

#208 (comment)

Also applies here

@uncvrd

uncvrd commented Dec 19, 2023

Hi everyone, I wanted to contribute my AWS S3 implementation (I use DO Spaces, but it should be similar for other S3-compatible services).

Please feel free to improve upon this concept or point out where I got it wrong; I just threw it together, but it seems to work. I use the Next.js 14 Pages Router, so I don't think I can make use of the revalidateTag methods yet, but they should be okay?

Thanks for creating this library :)

const { reviveFromBase64Representation, replaceJsonWithBase64 } = require("@neshca/json-replacer-reviver")
const { IncrementalCache, RevalidatedTags, Cache } = require("@neshca/cache-handler")
const createLruCache = require("@neshca/cache-handler/local-lru").default
const { S3, GetObjectCommand, PutObjectCommand, GetObjectCommandOutput } = require("@aws-sdk/client-s3")

const REVALIDATED_TAGS_KEY = "sharedRevalidatedTags"
const S3_BASE_PATH = "nextjs-cache"

const client = new S3({
    credentials: {
        accessKeyId: process.env.DO_SPACES_ACCESS_KEY_ID,
        secretAccessKey: process.env.DO_SPACES_SECRET_ACCESS_KEY,
    },
    endpoint: process.env.DO_SPACES_ENDPOINT,
    region: process.env.DO_SPACES_REGION,
    forcePathStyle: false,
})

const bucket = process.env.DO_SPACES_BUCKET

/**
 * @param {GetObjectCommandOutput["Body"]} body
 */
const fromBuffer = async (body) => {
    if (body) {
        const buffers = []

        for await (const data of body) {
            buffers.push(data)
        }

        return JSON.parse(Buffer.concat(buffers).toString("utf-8"), reviveFromBase64Representation)
    }

    return null
}

/** @returns {Promise<RevalidatedTags>} */
const getTags = async () => {
    const command = new GetObjectCommand({
        Bucket: bucket,
        Key: `${S3_BASE_PATH}/${REVALIDATED_TAGS_KEY}`,
    })

    try {
        const response = await client.send(command)

        const sharedRevalidatedTags = await fromBuffer(response.Body)

        const entries = Object.entries(sharedRevalidatedTags)

        return Object.fromEntries(entries.map(([tag, revalidatedAt]) => [tag, Number(revalidatedAt)]))
    } catch (error) {
        // this handles the scenario where the object hasn't been added to the bucket yet. S3 throws an error if it cannot find the key
        return {}
    }
}

/**
 * @param {string} key
 * @param {any} value
 */
const putCommand = async (key, value) => {
    const command = new PutObjectCommand({
        Bucket: bucket,
        Key: `${S3_BASE_PATH}/${key}`,
        Body: JSON.stringify(value, replaceJsonWithBase64),
        ContentType: "application/json",
        ContentEncoding: "base64",
    })

    await client.send(command)
}

IncrementalCache.onCreation(async () => {
    // read more about TTL limitations https://caching-tools.github.io/next-shared-cache/configuration/ttl
    const useTtl = false

    /** @type {Cache} */
    const s3Cache = {
        async get(key) {
            try {
                const command = new GetObjectCommand({
                    Bucket: bucket,
                    Key: `${S3_BASE_PATH}/${key}`,
                })

                const response = await client.send(command)

                return await fromBuffer(response.Body)
            } catch (error) {
                return null
            }
        },
        async set(key, value, ttl) {
            try {
                await putCommand(key, value)
            } catch (error) {
            }
        },
        async getRevalidatedTags() {
            try {
                return await getTags()
            } catch (error) {
            }
        },
        async revalidateTag(tag, revalidatedAt) {
            try {
                const existingTags = await getTags()

                const newTags = {
                    ...existingTags,
                    [tag]: revalidatedAt,
                }

                await putCommand(REVALIDATED_TAGS_KEY, newTags)
            } catch (error) {
            }
        },
    }

    const localCache = createLruCache({
        useTtl,
    })

    return {
        cache: [s3Cache, localCache],
        useFileSystem: !useTtl,
    }
})

module.exports = IncrementalCache

@uncvrd

uncvrd commented Jan 9, 2024

Hey @better-salmon ! Just wanted to report back with a question...

So I ran into an issue: when I build and deploy a new version of my code, existing cached pages first show an empty page, and if I refresh, the page loads correctly.

Given that I think I understand how ISR works, I understand why this happens:

  1. user tries to view a page, the ISR expiry date has passed
  2. app tries to pull the page from cache, but the page references old build .js filenames and build IDs
  3. app page is blank, but this page request triggers a new ISR write to the cache
  4. user refreshes, page is displayed

I'm wondering how I should handle this scenario, since the initial page view references a cached page whose JS files can no longer be resolved (the JS file IDs change on each build).

Do I need to add a deployment webhook that clears the cache in my S3 bucket on deployment, or did I mess something up in my config above?

Curious about your thoughts, or how you handle this in your Redis implementations.

For reference, I've exported the .json that is saved in my S3 bucket so you can see the diff.

Old === what is in the cache after a new build but without a refresh
New === what is in the cache after the page refresh

new.json
old.json

Thanks for your time!


EDIT: in case you're curious how I'm doing SSG, here is an example I use on a page. I default to dynamically generating all paths after deployment (hence the empty paths array):

export const getStaticPaths: GetStaticPaths = async () => {
    return {
        paths: [],
        fallback: "blocking",
    }
}

export const getStaticProps: GetStaticProps<Props, RouteParams> = async ({ params }) => {
    if (params) {
        const ssg = createServerSideHelpers({
            client: vtrpc,
        })

        try {
            await ssg.landingPage.byLinkName.prefetch({
                name: params.shortcode,
                site: params.site,
            })
        } catch (error) {}

        try {
            await ssg.landingPage.seo.prefetch({
                name: params.shortcode,
                site: params.site,
            })
        } catch (error) {}

        return {
            props: {
                id: params.shortcode,
                site: params.site,
                trpcState: ssg.dehydrate(),
            },
            revalidate: 60,
        }
    } else {
        return {
            redirect: {
                destination: "/",
                permanent: true,
            },
        }
    }
}

@better-salmon
Contributor Author

Hi @uncvrd,

I apologize for the delay in getting back to you. I want to start by thanking you for your participation! Your example will be added to the documentation, but I need some time to test it.

Now, let me answer your questions. When deploying a new build of your application, it's important not to use the cache from the previous builds. To automate this process, I recommend using two methods simultaneously:

  1. Automatically delete your old ISR/Data cache when deploying a new build of your application.
  2. Use cache key prefixes to separate the old and new caches.

You can find instructions on using the build ID as a cache key prefix in the docs: Build Id as a Prefix Key.

@uncvrd

uncvrd commented Jan 15, 2024

Hey @better-salmon no worries - thanks for the response!

Thanks for the guidance on the build ID, I'll work on implementing it.

I originally wanted to set a short bucket lifecycle policy to automatically delete items after a small period of time, so that I wouldn't have to create a webhook to delete the whole cache on each build. However, all my SSG pages have a TTL of 60 seconds, and S3 doesn't allow a lifecycle policy shorter than one day.

I already use NATS (https://nats.io/) in my infrastructure (it can be used as a KV store, similar to Redis), so I figured I'd build a handler for it to test too. For this I've defined a lifecycle of 70 seconds (just a bit longer than my SSG TTL), and it seems to work great, with minimal downtime between builds since the TTL on my KV cache is so small.

Anyway, I took most of the inner workings of your Redis handler and built this:

// cache-handler/index.js

const { IncrementalCache } = require("@neshca/cache-handler")
const createLruCache = require("@neshca/cache-handler/local-lru").default
const { connect } = require("nats")
const createNatsCache = require("./nats-cache")

IncrementalCache.onCreation(async () => {
    // read more about TTL limitations https://caching-tools.github.io/next-shared-cache/configuration/ttl
    const useTtl = false

    const client = await connect({
        servers: process.env.NATS_INTERNAL_HREF,
        user: "admin",
        pass: process.env.NATS_ADMIN_PASS,
    })

    const natsCache = await createNatsCache({
        client,
        useTtl,
        // timeout for the NATS client operations like `get` and `set`;
        // after this timeout, the operation is considered failed and the `localCache` is used
        timeoutMs: 5000,
    })

    const localCache = createLruCache({
        useTtl,
    })

    return {
        cache: [natsCache, localCache],
        useFileSystem: !useTtl,
    }
})

module.exports = IncrementalCache

// cache-handler/nats-cache.js

const { JSONCodec } = require("nats")
const { Cache } = require("@neshca/cache-handler")

/**
 *
 * @param {Promise<any>} operation
 * @param {number} timeoutMs
 * @returns {Promise<any>}
 */
function withTimeout(operation, timeoutMs) {
    if (typeof timeoutMs !== "number" || isNaN(timeoutMs) || timeoutMs <= 0) {
        return operation
    }

    return new Promise((resolve, reject) => {
        const timeoutHandle = setTimeout(reject, timeoutMs, new Error(`Operation timed out after ${timeoutMs} ms`))

        operation
            .then((result) => {
                clearTimeout(timeoutHandle)
                resolve(result)
            })
            .catch((error) => {
                clearTimeout(timeoutHandle)
                reject(error)
            })
    })
}

/**
 * @param {any} kv
 * @param {string} key
 * @param {number} timeoutMs
 */
const getTags = async (kv, key, timeoutMs) => {
    try {
        const getRevalidatedTags = kv.get(key)

        const value = await withTimeout(getRevalidatedTags, timeoutMs)

        return value?.json() ?? {}
    } catch (error) {
        return {}
    }
}

/**
 * @param {{ client: any, keyPrefix?: string, revalidatedTagsKey?: string, useTtl?: boolean, timeoutMs?: number }} options
 * @returns {Promise<Cache>}
 */
async function createNatsCache({ client, keyPrefix = "", revalidatedTagsKey = "__sharedRevalidatedTags__", useTtl = false, timeoutMs }) {
    const js = client.jetstream()
    const kv = await js.views.kv("nextjs-cache", {
        ttl: 70 * 1000, // 70 seconds
    })

    // NOTE: I don't create the revalidatedTags object here (see the comment below the code)

    return {
        // name: "nats-stack",
        async get(key) {
            try {
                const getCacheValue = kv.get(keyPrefix + key)

                const value = await withTimeout(getCacheValue, timeoutMs)

                const cacheValue = value?.json?.() ?? null

                if (cacheValue?.value?.kind === "ROUTE") {
                    cacheValue.value.body = Buffer.from(cacheValue.value.body, "base64")
                }

                return cacheValue
            } catch (error) {
                return null
            }
        },
        async set(key, cacheValue, ttl) {
            try {
                let preparedCacheValue = cacheValue

                if (cacheValue.value?.kind === "ROUTE") {
                    preparedCacheValue = structuredClone(cacheValue)
                    // @ts-expect-error -- object must have the same shape as cacheValue
                    preparedCacheValue.value.body = cacheValue.value.body.toString("base64")
                }

                const setCacheValue = kv.put(keyPrefix + key, JSONCodec().encode(preparedCacheValue))

                await withTimeout(setCacheValue, timeoutMs)
            } catch (error) {}
        },
        async getRevalidatedTags() {
            try {
                return await getTags(kv, keyPrefix + revalidatedTagsKey, timeoutMs)
            } catch (error) {}
        },
        async revalidateTag(tag, revalidatedAt) {
            try {
                const sharedRevalidatedTags = await getTags(kv, keyPrefix + revalidatedTagsKey, timeoutMs)

                const newTags = {
                    ...sharedRevalidatedTags,
                    [tag]: revalidatedAt,
                }

                const setRevalidatedTags = kv.put(keyPrefix + revalidatedTagsKey, JSONCodec().encode(newTags))

                await withTimeout(setRevalidatedTags, timeoutMs)
            } catch (error) {}
        },
    }
}

module.exports = createNatsCache

Do note that I don't create the revalidatedTags object when initializing this, like you do for Redis; that is handled in the getTags function, which falls back to an empty object when the key cannot be found, since the tags object also expires every 70 seconds.

Anyways I hope this example is useful too!

@better-salmon
Contributor Author

Thank you for your attention and contribution!

I noticed one thing in your code that can be improved: you do not need the try/catch inside the Cache methods. In fact, you must let the error be thrown to pass the work on to the subsequent Handler.

I am working on a new documentation section that will sufficiently cover these nuances.
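To illustrate the contract described above (my reading of it, not an official example): a handler method should throw on failure so the next handler in the `cache` array can take over, and return `null` only for a genuine miss. A minimal sketch, with a hypothetical chain runner standing in for the library's internals:

```javascript
// Sketch of the error-propagation contract (assumption: a thrown error
// means "try the next handler"; `null` means "definitive cache miss").
function createThrowingCache(store) {
    return {
        async get(key) {
            // Do NOT swallow errors here: letting them propagate allows
            // the subsequent handler (e.g. the local LRU) to take over.
            const value = await store.get(key) // may throw
            return value ?? null
        },
    }
}

// Hypothetical chain runner illustrating the fall-through behaviour:
async function getFromChain(handlers, key) {
    for (const handler of handlers) {
        try {
            return await handler.get(key)
        } catch (error) {
            // this handler failed; fall through to the next one
        }
    }
    return null
}

module.exports = { createThrowingCache, getFromChain }
```

With this shape, a flaky remote backend degrades gracefully to the local cache instead of masking failures as cache misses.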

@uncvrd

uncvrd commented Jan 27, 2024

Gotcha - thanks for the feedback! I'll check the docs regarding error handling 🙏

@dross15

dross15 commented May 15, 2024

Hi @better-salmon! I'm trying to find where in the docs you wrote about the nuances of using try/catch. Was this ever added?

If not, is it still the case that you shouldn't use try/catch within the methods? And if we don't use try/catch, how do we handle 404 scenarios? Thank you!

@better-salmon
Contributor Author

Hello @dross15! It is mentioned here: https://caching-tools.github.io/next-shared-cache/api-reference/handler#overview and also in the guide on creating a custom Handler.


4 participants