Research item: User-defined error codes in watchdog #157

Closed
alexellis opened this issue Aug 29, 2017 · 18 comments

Comments

@alexellis
Member

alexellis commented Aug 29, 2017

Expected Behaviour

Return 200/404/400 etc from your function code.

Current Behaviour

200, 400 and 500 are supported, but only for the watchdog's internal use.

Calling os.exit with a non-zero code may also cause a 500 to be returned.

Possible Solutions

  • Allow the user to set an environment variable such as Http_Status_Code and let the watchdog read it before returning. (It appears this doesn't work.)
  • Read the "exit code" instead (note: this only ranges from 0 to 255)
  • Marshal a custom format for responses (adds work / non-standard)
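The exit-code option runs into the 0-255 limit noted above: on Unix the parent only sees the low 8 bits of the value passed to exit, so most HTTP status codes cannot be carried directly. A minimal sketch of that arithmetic follows; observedExitStatus is a hypothetical helper that models the truncation and is not part of the watchdog.

```go
package main

import "fmt"

// observedExitStatus models what a Unix parent process sees: only the
// low 8 bits of the value the child passed to exit. Hypothetical helper.
func observedExitStatus(code int) int {
	return code & 0xff
}

func main() {
	for _, c := range []int{200, 404, 500} {
		fmt.Printf("exit(%d) -> parent sees %d\n", c, observedExitStatus(c))
	}
}
```

Any scheme built on exit codes would therefore need a lookup table between small exit codes and HTTP statuses rather than passing the status through unchanged.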

Workaround

Return 200 with a well-formed body - let that indicate whether it was successful.

@alexellis
Member Author

I wrote a prototype for pulling back environment variables set within a Node.js script, but it didn't work. All I saw were the environment variables I passed in, even though I could see the new ones being set successfully inside the script.

@alexellis
Member Author

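// Node.js function: set the variable, then dump the process environment.
process.env.Http_Status_Code = 404;
console.log(process.env);

	// Watchdog (Go), end of the request handler: read the status code back.
	if !ri.headerWritten {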
		execTime := time.Since(startTime).Seconds()
		w.Header().Set("X-Duration-Seconds", fmt.Sprintf("%f", execTime))
		ri.headerWritten = true

		statusCode := getStatusCode(targetCmd.Env)

		w.WriteHeader(statusCode)
		w.Write(out)
	}

	if config.debugHeaders {
		header := w.Header()
		debugHeaders(&header, "out")
	}
}

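// getStatusCode returns the value of Http_Status_Code from env,
// or 200 if it is absent or not a valid integer.
func getStatusCode(env []string) int {
	code := http.StatusOK
	fmt.Println(env) // debug: show the environment the watchdog can see
	for _, e := range env {
		// Entries have the form KEY=value; skip any without "=".
		parts := strings.SplitN(e, "=", 2)
		if len(parts) != 2 || parts[0] != "Http_Status_Code" {
			continue
		}
		val, err := strconv.Atoi(parts[1])
		if err != nil {
			log.Println(err)
			continue // keep the default rather than returning 0
		}
		code = val
	}

	return code
}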
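A likely reason the prototype saw only the variables that were passed in: a child process works on its own copy of the environment, and exec.Cmd.Env holds what the parent supplied, not what the child ended up with. The sketch below illustrates this under the assumption that sh is available on the PATH; parentSeesChildEnv is a hypothetical helper, not watchdog code.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// parentSeesChildEnv runs a shell that exports Http_Status_Code and then
// reports whether the variable appears in cmd.Env once the child has exited.
func parentSeesChildEnv() (bool, error) {
	cmd := exec.Command("sh", "-c", "export Http_Status_Code=404")
	cmd.Env = []string{"PATH=/usr/bin:/bin"} // what the parent hands to the child
	if err := cmd.Run(); err != nil {
		return false, err
	}
	// cmd.Env is an input only; nothing the child did is written back to it.
	for _, e := range cmd.Env {
		if strings.HasPrefix(e, "Http_Status_Code=") {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	seen, err := parentSeesChildEnv()
	if err != nil {
		fmt.Println("could not run child:", err)
		return
	}
	fmt.Println("parent sees Http_Status_Code:", seen)
}
```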

@Marak

Marak commented Aug 30, 2017

@alexellis -

I've spent a lot of time researching and trying to solve this specific problem. The best solution I could think of was opening up FD 3 on the pipe and sending JSON messages back and forth.

We've got this technique working in production for 20 or so programming languages. For popular languages like Node or Python, we are able to completely wrap existing HTTP / WSGI interfaces so that user functions can run like Node.js middlewares or python wsgi applications.

Let me know if you have any questions. I'm interested to see how you end up solving this problem.

@Marak

Marak commented Aug 30, 2017

To be clear, FD 0, 1, and 2 are mapped to: STDIN, STDOUT, STDERR.

I've been using FD 3 as an additional communication channel for sending JSON between the parent and child process. Not sure if it's best practice, but it has been working and allowing for expressive APIs inside functions.

I don't see any specific reason why you couldn't just write the status header out directly to STDOUT and parse that output, but in our case we ended up not wanting to send anything but the response body itself over STDOUT.
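A minimal Go sketch of the FD 3 idea, for illustration only: the parent opens a pipe, hands the write end to the child as file descriptor 3, and reads a JSON message from it while STDOUT carries only the body. The meta struct, its "status" field, and runWithControlPipe are assumptions made for this example; the thread does not specify the actual message format. It also assumes sh is available.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"os"
	"os/exec"
)

// meta is a hypothetical control message sent by the child on FD 3.
type meta struct {
	Status int `json:"status"`
}

// runWithControlPipe runs a shell script and returns its stdout together
// with the metadata it wrote to FD 3.
func runWithControlPipe(script string) ([]byte, meta, error) {
	var m meta
	r, w, err := os.Pipe()
	if err != nil {
		return nil, m, err
	}
	defer r.Close()

	cmd := exec.Command("sh", "-c", script)
	var stdout bytes.Buffer
	cmd.Stdout = &stdout
	cmd.ExtraFiles = []*os.File{w} // ExtraFiles[0] becomes FD 3 in the child

	if err := cmd.Start(); err != nil {
		w.Close()
		return nil, m, err
	}
	w.Close() // close the parent's copy so the read below reaches EOF

	raw, readErr := io.ReadAll(r) // returns once the child closes FD 3
	if err := cmd.Wait(); err != nil {
		return stdout.Bytes(), m, err
	}
	if readErr != nil {
		return stdout.Bytes(), m, readErr
	}
	if err := json.Unmarshal(raw, &m); err != nil {
		return stdout.Bytes(), m, err
	}
	return stdout.Bytes(), m, nil
}

func main() {
	body, m, err := runWithControlPipe(`echo '{"status":404}' >&3; echo "cart not found"`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("status=%d body=%s", m.Status, body)
}
```

A binary that never writes to FD 3 would leave the pipe empty, so a real handler would need a default status for that case; this sketch simply returns the JSON decoding error.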

@alexellis
Member Author

Thanks for your input @Marak 👍 - an arbitrary protocol like reading JSON or HTTP over one of the streams could do the job for a managed language along with a set of client libraries. I am trying to see if this can be solved generically so that binaries which know nothing of custom protocols can still operate, for instance ffmpeg or ImageMagick.

@Marak

Marak commented Aug 31, 2017

Yes. We still have full support for arbitrary binaries using this method.

@Marak

Marak commented Aug 31, 2017

We also don't require a set of client libraries for users. Everything is handled by custom handler binaries we ship with our project per language. It's all seamless to the user ( no client libraries required in functions ), but does require a bit of work on our end to build these handlers.

Let me know if you want to discuss this or other issues further, I've been running an open-source faas platform since 2014 ( before AWS Lambda existed ). I didn't want to start posting links to our projects on your Github Issues.

@alexellis alexellis self-assigned this Sep 1, 2017
@alexellis
Member Author

Great to hear about your success @Marak. Are you forking a new process per request or reusing the same one? If you were re-using the same process, then surely some code needs to be in your language runtime to output the "custom JSON" format on stderr?

@Marak

Marak commented Sep 1, 2017

Not outputting custom JSON on stderr ( that is FD 2 ), we are using FD 3 which is an unused pipe. Overloading stderr is not a good idea ( we tried it ).

We are currently forking a new process per request. The Bash helloworld service takes about 20ms to complete its entire request / spawn / response lifecycle. I've considered preforking and doing a FastCGI type approach, but our scaling strategy hasn't required this ( yet ). Preforking shouldn't make much of a difference to the code structure either way.

We do have code in our language run-times which maintains the FD 3 comm channel and wraps it in the applicable interface per language. We do the same for incoming HTTP parameters to create nested data-structures in the specific language ( so users don't have to think about parsing and deserializing ARGV for CGI environment variables ). This is done through run-time code-injection and meta-programming. For compiled languages, such as those built with gcc, we perform a one-time compilation step which is cached based on the checksum of the user function.

@alexellis
Member Author

We should set up a hangout to compare notes. I think I have a strategy which could help make this even faster.

@Marak

Marak commented Sep 1, 2017

That sounds good to me. I'd be interested to see what the absolute minimal baseline is for processing a request and spawned process. The fastest I've gotten without pre-forking is about 20ms.

I'm on Google Hangouts as: marak.squires@gmail.com

@alexellis
Member Author

@johnmccabe - what do other frameworks do? Consult the CNCF white paper.

@JockDaRock - API Gateway still gives 200 OK

@alxyng

alxyng commented Sep 18, 2017

Another nice feature would be the ability to add custom headers to the response, for example Cache-Control or Content-Type. I had a couple of ideas for how this could be done, one of which is to follow the CGI/FastCGI approach of writing the entire HTTP response out. This does, however, have the disadvantage of complicating writing a simple response out, but @alexellis I know you're working on some "fast fork" stuff, so that may complement this.
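As a rough illustration of the CGI-style approach, the sketch below parses a function's output that begins with header lines (including the CGI "Status" header), followed by a blank line and the body. parseCGIResponse is a hypothetical helper written for this example and is not part of the watchdog; it assumes the function always emits the blank line separating headers from body.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/textproto"
	"strconv"
	"strings"
)

// parseCGIResponse splits CGI-style output into a status code, headers and
// body. A missing Status header defaults to 200.
func parseCGIResponse(out []byte) (int, http.Header, []byte, error) {
	br := bufio.NewReader(bytes.NewReader(out))
	mimeHeader, err := textproto.NewReader(br).ReadMIMEHeader()
	if err != nil {
		return 0, nil, nil, fmt.Errorf("reading headers: %w", err)
	}
	header := http.Header(mimeHeader)

	status := http.StatusOK
	// The Status value looks like "404 Not Found"; only the number is used.
	if fields := strings.Fields(header.Get("Status")); len(fields) > 0 {
		code, err := strconv.Atoi(fields[0])
		if err != nil {
			return 0, nil, nil, fmt.Errorf("bad Status header: %w", err)
		}
		status = code
		header.Del("Status")
	}

	// Everything after the blank line is the response body.
	body, err := io.ReadAll(br)
	if err != nil {
		return 0, nil, nil, err
	}
	return status, header, body, nil
}

func main() {
	out := []byte("Status: 404 Not Found\nCache-Control: no-cache\n\nwe could not find your cart")
	status, header, body, err := parseCGIResponse(out)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(status, header.Get("Cache-Control"), string(body))
}
```

The cost mentioned above is visible here: even the simplest function would have to print at least a blank line before its body.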

@alexellis
Member Author

Yes - fast fork allows full headers to be passed in either direction including error codes. It comes at a cost to the maintainers of the project but not to the consumer. The consumer would likely write the same handler they do today with the regular CGI.

This approach cannot be used when running binaries unless shell-script trickery is created too.

@alexellis
Member Author

This is an example of how I did it for my Dockercon demos - https://github.com/alexellis/faas-dockercon/blob/master/GetAvatar/handler.py

@thucnc

thucnc commented Feb 2, 2018

I am still unable to use a custom status code, say 400/401/404, in my handler (even with a custom Dockerfile like the 'GetAvatar' sample above). All the response fields, including the statusCode, end up in the body, so the status is always 200.
Could you confirm that this is still a known issue that hasn't been fixed yet?

@alexellis
Member Author

This thread is already several months old. @thucnc I was referring to serialising Content-Type in that example. See: https://github.com/alexellis/faas-dockercon/blob/master/GetAvatar/handler.py#L27

There is no "issue" to be fixed; the behaviour is by design and a constraint you will need to work within. However, there is some more context you will find useful:

We have an alternative option which allows HTTP status codes to be returned from the function - but for the time being I'd encourage you to do the following:

  • If you let the code exit normally you'll get a status of 200; the caller then inspects the body for the status. For instance, you can produce a JSON payload like this:
{
  "result": "there are 10 items waiting in your cart",
  "status": 200
}
{
  "result": "we could not find your cart",
  "status": 404
}
  • If you have an error, call system.exit with a non-zero code; this gives a 500 error.
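On the calling side, the JSON-body convention could be handled roughly as follows. This is a sketch only: the envelope type and checkEnvelope are hypothetical names, and treating any status of 400 or above as an error is an assumption made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope mirrors the example payloads: a result message plus a status.
type envelope struct {
	Result string `json:"result"`
	Status int    `json:"status"`
}

// checkEnvelope decodes the function's body and turns a status of 400 or
// above into an error, since the HTTP status itself is always 200.
func checkEnvelope(body []byte) (string, error) {
	var e envelope
	if err := json.Unmarshal(body, &e); err != nil {
		return "", fmt.Errorf("decoding body: %w", err)
	}
	if e.Status >= 400 {
		return "", fmt.Errorf("function reported status %d: %s", e.Status, e.Result)
	}
	return e.Result, nil
}

func main() {
	bodies := []string{
		`{"result": "there are 10 items waiting in your cart", "status": 200}`,
		`{"result": "we could not find your cart", "status": 404}`,
	}
	for _, b := range bodies {
		result, err := checkEnvelope([]byte(b))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", result)
	}
}
```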

When we have an enhancement ready to return custom HTTP status codes we'll let you know.

Important: Bear in mind that the option above (JSON bodies) is not an HTTP status code as you know it, but it still provides the context you need to determine success or failure. This is the case with many FaaS private/cloud frameworks - you're calling a function which happens to use HTTP for transport - it is not and should not be confused with an HTTP framework or REST API.

@alexellis
Member Author

Anyone landing here now should use our of-watchdog, which supports a full HTTP request and response.

@openfaas openfaas locked and limited conversation to collaborators Oct 29, 2020