Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

code reloads on every request in 6.10.0 #1353

Closed
dgobaud opened this issue May 31, 2019 · 49 comments · Fixed by #1733
Closed

code reloads on every request in 6.10.0 #1353

dgobaud opened this issue May 31, 2019 · 49 comments · Fixed by #1733

Comments

@dgobaud
Copy link

dgobaud commented May 31, 2019

Every HTTP request seems to result in a full code reload which causes us problems.

Eg we decrypt config at load through KMS. If it happens on every request it'll run up the KMS bill.

@samtstern
Copy link
Contributor

@dgobaud thanks for raising this issue! That's the intended behavior right now, but we didn't realize some people would not want that.

Can you show an example of your code?

@dgobaud
Copy link
Author

dgobaud commented May 31, 2019

Yes please see below. It should only reload on code changes (hot reload). Why reload everytime? That is slow and doesn't mirror production behavior right?

// Assume this is costly
const promiseLoadConfig = loadConfig();

app.use((_req, _res, next) => {
  (async () => {
    await promiseLoadConfig;

    next();
  })();
});

app.use("/auth", auth.createAuthRoute(firebaseApp));

export const v1 = functions.https.onRequest(app);

@abeisgoat
Copy link
Contributor

The act of hot-reloading on each invocation is designed to mimic the unpredictable behavior of the true GCF environment (your code may reload on each request, or it may not, we make no guarantees). The more common situation which developers hit is that they rely on some global state and as a result get hurt when they have more function invocations (which spawn n unrelated GCF instances with separate global state for each) By forcing hot-reloading we are basically providing the worst-case scenario for you and saying if your code is resilient to this environment, it'll be resilient in production.

That's how it's currently designed - is that the right decision? It's debatable. I think it would be hard to convince me that persisting the environment between invocations is ideal because, as I said, I'd rather encourage developers to create more resilient functions. Truthfully, I feel that this is probably doing it's job and highlighting an issue for you, if you're concerned about API invocations locally then in production you'd still potentially (but again we make no promises) have this issue, so it's better to design around this issue in your code then hope the function containers are long-lived enough to avoid this.

However, what we should definitely offer (and currently don't) is an easy way to check if you're in the emulator (like process.env.IS_FIREBASE_EMULATOR so you can use locally supplied values instead of production services. This would at least provide a work-around locally if you're confident that you want this behavior in prod so you can do something different locally.

@dgobaud
Copy link
Author

dgobaud commented Jun 1, 2019

According to the documentation Function instance lifespan the typical behavior is function instances are reused and it recommends caching state in the global scope which makes sense in order to minimize cold starts and maximize performance. Amazon Lambda broadly works the same way for seemingly the same performance reason https://docs.aws.amazon.com/lambda/latest/dg/running-lambda-code.html

I think it would make more sense to have an option --worst-case-performance or something in order to emulate the worst case to make sure code is robust and can handle it but it should not be the default. Having the code auto-reload on code changes though is a nice feature and I think older versions maybe already did that?

Perhaps there is also a better way to do what we are doing. We are using KMS to decrypt secrets that are stored encrypted in code following the recommendation "Storing secrets in code, encrypted with a key from Cloud KMS.".

So loadConfig() runs at start, decrypts environment variables, and loads them into globally scoped config.

The code is robust and works with it reloading every time but that is going to run up the KMS bill. We could optimize the code by storing a single encrypted blob of keys vs. each key individually which would reduce KMS calls to 1 per load but it is harder to manager and will still run up the bill.

"The environment running a function instance is typically resilient and reused by subsequent function invocations, unless the number of instances is being scaled down (due to lack of ongoing traffic), or your function crashes. This means that when one function execution ends, another function invocation can be handled by the same function instance. Therefore, it is recommended to cache state across invocations in global scope where possible. Your function should be still prepared to work without this cache available as there is no guarantee that the next invocation will reach the same function instance (see Stateless functions)."

@samtstern
Copy link
Contributor

@dgobaud thanks for such a detailed response! I think you're right that we need to introduce some nuance in the behavior.

As far as your current problem ... could you lazy-load the secrets the first time you need them and then keep them cached in the global scope from there on? Would that help at all in your situation?

@abeisgoat
Copy link
Contributor

@dgobaud Thanks for the thoughts!

I will have to reflect on this a bit and determine if I think emulating something beyond "worse case" is worth the technical trade-off of comprising our existing, fairly succinct runtime structure since this would add a lot of complexity in our tests and introduce an entire layer of new potential bugs. I agree it's useful, but we need to balance maintainability of this already complex system.

If anyone else is reading this has thoughts, please chime in!

@dgobaud
Copy link
Author

dgobaud commented Jun 3, 2019

@samtstern great! I hope you change it soon because right now we are stuck on firebase-tools 6.8.0.

I believe that is how we have done it. loadConfig() is called once in our main index.ts and it caches config.

Each request does await promiseLoadConfig; which is resolved when loadConfig() finishes which is only called once.

Then the request continues via next() and the request handlers all simply do import { config } from "../config";

// index.ts

const firestoreSettings = { timestampsInSnapshots: true };
admin.firestore().settings(firestoreSettings);

// Configure routes
const app = express();
app.use(cors({ origin: true }));

loadConfig();

app.use((_req, _res, next) => {
  (async () => {
    const loadConfigSuccess = await promiseLoadConfig;

    if (!loadConfigSuccess) {
      console.error("loadConfig FAILED - exiting");

      process.exit(1);
    } else {
      next();
    }
  })();
});

app.use("/auth", auth.createAuthRoute(firebaseApp));

export const v1 = functions.https.onRequest(app);

// config.ts

let loadConfigResolve: (value?: boolean | PromiseLike<boolean>) => void;

export const promiseLoadConfig: Promise<boolean> = new Promise<boolean>(
  resolve => {
    loadConfigResolve = resolve;
  }
);

export const config: SettingsType = {};

export const loadConfig = async () => {
  try {
   // do stuff to load config varriable

    loadConfigResolve(true);

    return true;
  } catch (exception) {
    console.error(`[CONFIG MANAGER] ERROR - ${exception.message}`);
    console.error(exception);

    loadConfigResolve(false);

    return false;
  }
};

@dgobaud
Copy link
Author

dgobaud commented Jun 3, 2019

@samtstern and the problem is this doesn't work - it seems global scope is reloaded on every HTTP request. It seems to create an entirely new instance or whatever you call it.

@dgobaud
Copy link
Author

dgobaud commented Jun 4, 2019

@abeisgoat I think you're saying this new functionality where it reloads everything is useful but hard to maintain and you're considering reverting to the old way where it only reloads on source code changes? If so that would be great - I don't think this reload everything on every request is beyond "worse case" - it is the worst case.

Is the issue your test code right now has been changed for this functionality so if you revert it you need to revert the test code, then if you add --worst-case option you'd have to also copy the current test code over and maintain it in addition to the default mode?

@dgobaud
Copy link
Author

dgobaud commented Jun 5, 2019

@samtstern @abeisgoat any update on this? thanks

@samtstern
Copy link
Contributor

@dgobaud we have not yet decided what we want the behavior here to be, we are discussing it. If this is blocking you please downgrade to 6.8.0 as I don't think there will be a change here for at least a few weeks. I hope you can understand that this is a very complex decision and we want to make it only once.

@dgobaud
Copy link
Author

dgobaud commented Jun 5, 2019

@samtstern got it I also just wanted to make sure I understand @abeisgoat's current thinking and what is the "beyond worst case" emulation idea/option. I do hope you revert or add the --worst-case option for testing because I don't think the current functionality is really practical/usable in development.

@abeisgoat
Copy link
Contributor

@dgobaud what I meant was that in theory, we could shift over the structure of the emulator to preserve state between requests. We've had some more discussions about this internally since my first comment and I think I'm still leaning towards the current implementation of providing only the worse-case scenario. That being said, we just merged in a patch which adds process.env.FUNCTIONS_EMULATOR so you can stub out your expensive API calls when in the emulator.

We could add in a flag which changed code to hot-reload only on file-system changes, but I think this would add a lot of complexity and room for error in the code.

The emulator before 6.9.0 was a totally unrelated legacy code base, which is why changes were made here and we shifted to a simpler more succinct invocation model. We need to make sure that the emulator balances utility and maintainability since we need to make sure different engineers can maintain it over time.

Anyway, with the new env variable (which will come out next week with the 7.0.0 release), you should be able to avoid hitting the API when running the code in the emulator.

We're still not sure what path we'll take with this and I'll leave this bug open for awhile to see if other developers have similar complaints above the code-loading mechanism, if it's a common issue we'll reevaluate.

@dgobaud
Copy link
Author

dgobaud commented Jun 7, 2019

Ok thanks for the update - process.env.FUNCTIONS_EMULATOR will at least provide a way to practically run the code locally, albeit a way that adds complexity on the end developer (us) side through special casing the emulator.

I do see how changing the code to only reload on file changes is more complicated than reloading every time.

If you do implement the file change monitoring code, I assume you can use the current invocation code so maintaining what you already have written (invocation code) won't be more work/harder.

But you will need to maintain the new file change monitoring code which does add complexity at the benefit of more realistically emulating the production environment and adhering to the documentation about global state caching.

And I think you could easily add the --worst-case option that just makes it run as it currently does without file change monitoring.

@abeisgoat
Copy link
Contributor

Thanks for all your thoughts - they're super important for deciding what direction to take the emulator.

I'm gonna keep thinking on this and see if we can't get the best of all worlds. As I mentioned in #1360 the current structure also makes using interactive debuggers hard because the runtime PID is changing all the time. This frustrates me and makes it more likely that I'll shift over the runtime model to support both these use-cases, but we'll see.

@dgobaud
Copy link
Author

dgobaud commented Jun 10, 2019

@abeisgoat interesting - I haven't tried debugging but can see how breakpoints breaking on every HTTP request could make debugging very annoying... seems the new emulator launches a new process for every request? Guessing there isn't really a way to make breakpoints persist in that model since I bet they get hooked up on process launch?

@ChromeQ
Copy link

ChromeQ commented Jul 16, 2019

Have there been more discussions on this and are we leaning towards a solution?

I too would like to see an option provided to emulate the hot-reload or cold-start as I was sure this used to happen and now my caching is broken in dev.
I checked the docs and it still recommends to use global vars and with a cold start on each invocation makes it hard to make sure my "caching" is working in dev.
Perhaps a --cold-start option is more descriptive than --worst-case?

@samtstern
Copy link
Contributor

@ChromeQ we talk about this frequently but have still not arrived at a solution we're happy with. There are two things at play here. The first is what sort of options we want to expose to the developer, which is the relatively easy half of the question. The second part is the technical implementation, which is much harder. Currently the whole emulator is built around the process-per-invocation assumption and to create long-lived functions processes that can survive multiple invocations will be an extreme re-write.

We are willing to do this work, but I just want to set expectations that it could take a while longer. We really appreciate the feedback!

@ChromeQ
Copy link

ChromeQ commented Jul 16, 2019

Thanks for the quick reply. I appreciate it is probably much tougher than it sounds, especially given that behaviour is how it used to work back in v6.8.0 and there has been a lot of changes since then.
Happy to wait and see how this progresses.

@dgobaud
Copy link
Author

dgobaud commented Jul 16, 2019

@samstern I see I guess there was a big change with how 6.8.0 worked? We're still using 6.8.0 but starting to notice things that need the newer version like I think Firestore rule testing which is important.

@dgobaud
Copy link
Author

dgobaud commented Aug 28, 2019

@samtstern any update on this please?

@samtstern
Copy link
Contributor

@dgobaud nothing new to share. Right now we are focused on getting some use-cases feature complete, namely getting full SDK and emulator support for all RTDB/Firestore + Functions use cases. Then once we are feature complete and stable there we will work on re-architecting the functions runtime to have better support for hot reloads and debugging.

@Dean-NC
Copy link

Dean-NC commented Sep 4, 2019

I'm new to firebase, and I spent some time trying to figure out why I had this problem...it seemed that node module caching wasn't working, during emulation. I had not deployed the function yet. After deployment, I saw that all worked as expected. After reading the comments here, it seems this is just an emulation limitation/feature.

@lookfirst
Copy link

I came here because I'm making a connection to postgres. In development, my functions hit the instance running on my local machine. In production, they hit the cloud sql postgres instance. Creating the connection on every request locally is not the end fo the world, but kind of slow. I totally get why things are the way they are, but it would be nice to also escape out of that.

Not being able to test lazy invocation of globals is a small bummer. It would at least be nice to document the fact that the emulator runs index.ts every time.

One idea I didn't see proposed in this thread would be to somehow create an api for global variables. Something like this:

src/index.ts:

import {globals} from 'firebase-functions';

if (!globals.contains('connection')) {
    globals.set('connection', async () => await makeConnection());
}

But I'm not sure how that works internally in the emulator with module reloading. Someone smarter than I gets to figure that one out. =)

@dgobaud
Copy link
Author

dgobaud commented Sep 20, 2019

@abeisgoat I'm still on 6.8.0 because of this just last night started having a problem deploying I contacted support they said to upgrade but we can't I wonder is 6.8.0 now unable to deploy? It is becoming a problem not being able to upgrade :(

[debug] [2019-09-20T10:07:08.934Z] ----------------------------------------------------------------------
[debug] [2019-09-20T10:07:08.936Z] Command: /Users/USER/.nvm/versions/node/v8.15.0/bin/node /Users/USER/.nvm/versions/node/v8.15.0/bin/firebase deploy --only storage,hosting,firestore:rules,firestore:indexes,functions -P PROJECT
[debug] [2019-09-20T10:07:08.936Z] CLI Version: 6.8.0
[debug] [2019-09-20T10:07:08.936Z] Platform: darwin
[debug] [2019-09-20T10:07:08.936Z] Node Version: v8.15.0
[debug] [2019-09-20T10:07:08.937Z] Time: Fri Sep 20 2019 03:07:08 GMT-0700 (PDT)
[debug] [2019-09-20T10:07:08.937Z] ----------------------------------------------------------------------
[debug]
[debug] [2019-09-20T10:07:08.945Z] > command requires scopes: ["email","openid","https://www.googleapis.com/auth/cloudplatformprojects.readonly","https://www.googleapis.com/auth/firebase","https://www.googleapis.com/auth/cloud-platform"]
[debug] [2019-09-20T10:07:08.945Z] > authorizing via signed-in user
[debug] [2019-09-20T10:07:08.945Z] [iam] checking project PROJECT for permissions ["cloudfunctions.functions.create","cloudfunctions.functions.delete","cloudfunctions.functions.get","cloudfunctions.functions.list","cloudfunctions.functions.update","cloudfunctions.operations.get","datastore.indexes.create","datastore.indexes.delete","datastore.indexes.list","datastore.indexes.update","firebase.projects.get","firebasehosting.sites.update","firebaserules.releases.create","firebaserules.releases.update","firebaserules.rulesets.create"]
[debug] [2019-09-20T10:07:08.946Z] >>> HTTP REQUEST POST https://cloudresourcemanager.googleapis.com/v1/projects/PROJECT:testIamPermissions

[debug] [2019-09-20T10:07:09.070Z] <<< HTTP RESPONSE 200
[debug] [2019-09-20T10:07:09.071Z] >>> HTTP REQUEST GET https://firebase.googleapis.com/v1beta1/projects/PROJECT

[debug] [2019-09-20T10:07:09.271Z] <<< HTTP RESPONSE 200
[debug] [2019-09-20T10:07:11.711Z] TypeError: Cannot read property 'wanted' of undefined
at /Users/USER/.nvm/versions/node/v8.15.0/lib/node_modules/firebase-tools/lib/checkFirebaseSDKVersion.js:37:51
at <anonymous>
at process._tickDomainCallback (internal/process/next_tick.js:229:7)

@dgobaud
Copy link
Author

dgobaud commented Sep 26, 2019

@abeisgoat any chance you got to look at the deploy problem? I'm talking to Firebase Support and they don't seem to be able to reproduce but it seems like something has changed on the server breaking 6.8.0.

@samtstern
Copy link
Contributor

@dgobaud if you have the issue with "TypeError: Cannot read property 'wanted' of undefined" I believe that was a problem with npm version 6.10.0.

Run npm --version and make sure you have 6.10.1 or later. If you don't, run:

npm install -g npm@6.10.1

@dgobaud
Copy link
Author

dgobaud commented Sep 30, 2019

@samtstern I'm using 6.11.3. I got support to reproduce the issue. If you try to deploy using firebase-functions@latest with latest firebase-tools it causes the problem

@dgobaud
Copy link
Author

dgobaud commented Oct 1, 2019

@abeisgoat @samtstern I heard back from support they said to check here for any update. Is there any update on what is going to be done about the code reload problem in general? Not being able to upgrade is becoming a big problem.

@samtstern
Copy link
Contributor

@dgobaud sorry to hear this is becoming such a problem for you. We have not made much progress on changing this behavior.

If you really need functions emulation you could try Google Cloud's "functions framework" which may be useful:
https://github.com/GoogleCloudPlatform/functions-framework-nodejs

@dgobaud
Copy link
Author

dgobaud commented Oct 3, 2019

@samtstern thanks for replying. So the idea is upgrade firebase-tools for deployment and other stuff and use this functions-framework thing to run the emulator/simulator locally and it'll work? I'm guessing it doesn't do the code reload? If functions-framework is the official emulator of the Functions service/product and it doesn't do code reload isn't that another strong signal that Firebase shouldn't either because it runs on Functions and doesn't change the underlying restart etc functionality?

@Dean-NC
Copy link

Dean-NC commented Oct 4, 2019

I'm confused by some of the recent comments, but I do know that I'm on 7.3.1, and normal node caching of modules doesn't work when the functions are served locally:
either with firebase emulators:start --only functions or firebase serve --only functions.

@dgobaud
Copy link
Author

dgobaud commented Oct 4, 2019

@Dean-NC I'm not exactly sure what node module caching is but almost certain yes it is this Firebase emulator bug - it really needs fixed :(

@dgobaud
Copy link
Author

dgobaud commented Oct 17, 2019

@samtstern @abeisgoat is there any update on this by chance or a response to the point that "If functions-framework is the official emulator of the Functions service/product and it doesn't do code reload isn't that another strong signal that Firebase shouldn't either because it runs on Functions and doesn't change the underlying restart etc functionality?"

Thank you

@samtstern
Copy link
Contributor

@dgobaud no there is no news, I promise I will update this issue when I have some. I really appreciate your enthusiasm about this, I hope you can understand that while there are drawbacks of the way we're running code there are also many benefits for others.

@dgobaud
Copy link
Author

dgobaud commented Oct 17, 2019

@samtstern ok how about adding an option to run the old way? the code really isn't written such that you can just not do the process kill or something and the server keeps running and will process more requests?

@samtstern samtstern assigned samtstern and unassigned abeisgoat Oct 17, 2019
@samtstern
Copy link
Contributor

@dgobaud it's not possible to run the old way right now. The old way depended on cloud-functions-emulator which is dead and doesn't support modern Node code.

We would provide you that option if we could. I am going to look into this today and see what we can achieve in the short term.

@samtstern
Copy link
Contributor

I am working on this in #1733, early results are very promising I was able to execute a second request without re-loading code. A lot of work to do around failure cases though.

@dgobaud
Copy link
Author

dgobaud commented Oct 17, 2019

@samtstern awesome!

@shaunluttin
Copy link

shaunluttin commented Oct 23, 2019

We hit this problem when upgrading Firebase. Previously, our 4.5 second cold-start was reasonable for our use case. Cold-start was rare in production.

Now, we are developing against a 4.5 second cold-start between every request (even between a CORS OPTIONS and GET request). This is not manageable for development.

We now have to choose among:

  1. not upgrading Firebase,
  2. upgrading Firebase with a very slow local development environment,^
  3. upgrading Firebase and doing unplanned work to improve our cold-start performance.

We would appreciate a migration path for the upgrade that lets us continue rapid development without the unplanned work on our cold-start performance.

^ This raises a fourth option: (4) upgrading Firebase and doing most development testing against a remote instance, which would cost us more money.

@samtstern
Copy link
Contributor

@shaunluttin thank you for sharing and sorry you had a bad experiece! We are getting really close to fixing this behavior in #1733 so I think the best bet is to wait.

@samtstern
Copy link
Contributor

@dgobaud @ChromeQ @lookfirst @Dean-NC @shaunluttin this has been released in 7.6.2, please give this a try and let us know if it's working for you!

@dgobaud
Copy link
Author

dgobaud commented Oct 29, 2019

will test it!

@lookfirst
Copy link

This is amazing! It works wonderfully. Huge round of thanks for this!

The only weirdness that I'm noticing is that it appears as though any exception causes a reload. For example... my app did this when I first started things today:

  1. First request comes in... initialize database connection to postgres with an express middleware which puts the connection into a global.
  2. Handle web request through apollo graphql server.
  3. Within apollo, validate the firebase auth idToken the client sent and realize it is expired.
  4. throw exception on invalid token which causes server to tell the client to fetch a new id token and retry again.
  5. Init database connection again.
  6. Successfully finish request with new idToken.

Totally not the end of the world, but interesting to watch how the server deals with things. Does this match up with your expectations?

Even if the exception is caught, should the server cause a reload?

@samtstern
Copy link
Contributor

samtstern commented Oct 29, 2019 via email

@lookfirst
Copy link

Sorry, I'm super busy with my own project right now... =( The test is basically to see if caught exceptions cause a reload. =) I'm pretty sure apollo catches the exception...

@Dean-NC
Copy link

Dean-NC commented Oct 29, 2019

What's the difference between installing with npm and with using firebase-tools-win.exe or the -instant-win.exe, and what's the difference between instant/non-instant?
I searched Google, but absolutely nothing.

@samtstern
Copy link
Contributor

@Dean-NC the difference is that with npm you need npm and node on your system while the exe files are stand-alone binaries that allow you to run firebase-tools without installing any dependencies.

(for future questions please open a new thread, let's keep this one on topic)

@Dean-NC
Copy link

Dean-NC commented Oct 29, 2019

@samtstern Thanks...will do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants