Skip to content

asyncio + h11 (was #58)#60

Merged
denismakogon merged 34 commits intomasterfrom
delayed-import
Feb 6, 2019
Merged

asyncio + h11 (was #58)#60
denismakogon merged 34 commits intomasterfrom
delayed-import

Conversation

@denismakogon
Copy link
Copy Markdown
Member

@denismakogon denismakogon commented Dec 4, 2018

What's being done?

Customer's code import happens right before the first calls.

Why it's so necessary?

With this change, the cold start only affected by FDK itself,
but not by the customer's code.

Minimum Python requirement

3.7.1

Impacts

Fn CLI due to the boilerplate in it.

Based on #62 and #61

@zootalures
Copy link
Copy Markdown
Member

I don't think this is necessary if we manage timeouts better around function initialisation.

@denismakogon
Copy link
Copy Markdown
Member Author

This is still necessary, no matter what. Because customer's code can take an enormous amount of time on the module initialization.

@denismakogon denismakogon changed the title Delayed import asyncio + h11 (was #58) Dec 4, 2018
@denismakogon
Copy link
Copy Markdown
Member Author

@rdallman i've got this to work, need to fix some tests.

@rdallman
Copy link
Copy Markdown
Contributor

rdallman commented Dec 5, 2018

not against the module loading thing in principle, it does seem like a problem we'll run into pretty quickly. that being said, this patch is now 2 kinda big (scope, not code necessarily) ideas wrapped into one plus large amounts of code are just being tossed out (which is great), and it's hard to sniff out what's what exactly in this PR now to really review it. any chance you could stage the removals first, then have a couple patches for the changing user module load and asyncio stuff? it would make it easier for a reviewer who may not be very familiar with the inner workings of the fdk (I believe there is only one person really, and we need to get more people into it, is what I'm hinting at).

thanks for fixing up h11 stuff, that's great, I hope it's a good solution at least.

Comment thread fdk/http_stream.py Outdated
Comment thread fdk/http_stream.py Outdated
@denismakogon
Copy link
Copy Markdown
Member Author

not against the module loading thing in principle, it does seem like a problem we'll run into pretty quickly. that being said, this patch is now 2 kinda big (scope, not code necessarily) ideas wrapped into one plus large amounts of code are just being tossed out (which is great), and it's hard to sniff out what's what exactly in this PR now to really review it

Well, that’s true, however, delayed import change is small and stratightforward. I’ll add better explanation of what this PR consists of.

@rdallman
Copy link
Copy Markdown
Contributor

rdallman commented Dec 5, 2018

delayed import change is small and stratightforward.

screen shot 2018-12-05 at 10 46 59 am

seems to be most of this PR though in its current capacity? would appreciate splitting anyway (I think it's mostly orthogonal removals, which would be nice to separate out, too)

@denismakogon
Copy link
Copy Markdown
Member Author

@rdallman Okay, delayed import change itself is small, honestly, but I removed lots of stale code that nobody uses, I'll split that into a separate PR.

@denismakogon
Copy link
Copy Markdown
Member Author

denismakogon commented Dec 6, 2018

Okay, PR is ready.

What you need to test it?

  1. Install build dependencies: pip install wheel
  2. Build FDK: PBR_VERSION=test python3.7 setup.py bdist_wheel
  3. Create a function with the following dockerfile:
FROM fnproject/python:3.7.1-dev as build-stage

ADD . /function
WORKDIR /function
# add all dependencies you have to install
RUN pip3 install --target /python/  --no-cache --no-cache-dir fdk-test-py3-none-any.whl 

RUN rm -fr ~/.cache/pip /tmp* requirements.txt func.yaml Dockerfile .venv

FROM fnproject/python:3.7.1

COPY --from=build-stage /function /function
COPY --from=build-stage /python /python
ENV PYTHONPATH=/python

ENTRYPOINT ["/python/bin/fdk", "/function/func.py", "handler"]

func.py:

import json
import io

from fdk import response


def handler(ctx, data=None):
    name = "World"
    try:
        body = json.loads(data.getvalue())
        name = body.get("name")
    except (Exception, ValueError) as ex:
        print(str(ex))

    return response.Response(
        ctx, response_data=json.dumps(
            {"message": "Hello {0}".format(name)}), 
        headers={"Content-Type": "application/json"}
    )
  1. move FDK wheel (suppose to be here dist/fdk-test-py3-none-any.whl) to function's folder
  2. call deploy
  3. invoke

@rdallman
Copy link
Copy Markdown
Contributor

rdallman commented Dec 6, 2018

this patch still seems to rely on delayed import even after rebasing (I didn't push), hard to untangle right now... guess we do the other one first? I'll make some comments there...

@rdallman
Copy link
Copy Markdown
Contributor

rdallman commented Dec 6, 2018

What you need to test it?

is this the process now for any python fdk function? what's with wheel? I looked up what it is - but is it required? this seems more cumbersome than creating these 2 files in a folder and hitting fn init; fn deploy?

Comment thread fdk/context.py
Comment thread requirements.txt
@rdallman rdallman mentioned this pull request Dec 7, 2018
@denismakogon
Copy link
Copy Markdown
Member Author

is this the process now for any python fdk function? what's with wheel? I looked up what it is - but is it required? this seems more cumbersome than creating these 2 files in a folder and hitting fn init; fn deploy?

No, that’s for you basically. People would still get it from PYPI.

@denismakogon
Copy link
Copy Markdown
Member Author

rebased on master.

@fn-bot
Copy link
Copy Markdown

fn-bot commented Dec 9, 2018

CLA Bot

Thank you for your submission! It appears that the following authors have not signed our Contributor License Agreement:

  • Denis Makogon (The email used in the commit is not linked to your GitHub account. As a result we cannot verify that you have signed the CLA. If you have signed already, please let us know in our community Slack. Thanks!)

Please do so now by visiting http://www.oracle.com/technetwork/community/oca-486395.html

Once complete, let us know in our community Slack and we’ll send you an Fn T-shirt.

We are working on modernizing the CLA process into a digital signature but it isn’t quite ready yet.

Thank you for being a part of the Fn Community!

@denismakogon denismakogon force-pushed the delayed-import branch 2 times, most recently from 57db490 to 9d12f44 Compare December 9, 2018 10:34
@fn-bot
Copy link
Copy Markdown

fn-bot commented Dec 9, 2018

CLA Bot

All committers have signed the CLA.

Comment thread fdk/http/event_handler.py Outdated
Comment thread fdk/http/event_handler.py Outdated
@rdallman
Copy link
Copy Markdown
Contributor

@denismakogon so i've ventured down a new road with httptools package https://github.com/fnproject/fdk-python/tree/delayed-import-42 - it's kinda like we discussed a while ago, it uses a C (nodejs) request parser (which means cython) - package itself is super easy to use (way easier than h11), and with a few lines of cleaning up to handle the body and run the function instead of just writing a toy 'hello world' response, I think that it will work, need to add a few more callbacks - if you're interested then old sanic server https://github.com/hhstore/annotated-py-sanic/blob/master/sanic-0.1.9/sanic/server.py is a pretty good reference for usage of this package really, before asgi stuff was a thing. posting cuz i'm running out for a bit and don't want to go too far down this path if there's some obvious issue with it you know of, also, you're welcome to branch off and try yourself..

with the linked branch up there i can reliably run functions via fn on my laptop, 20 at a time, without 502/504 (on startup or otherwise), with keep alives, etc. I don't think that cython is an issue for us since fn test is no longer a thing, pretty much users need to bundle their code into a container to test against fn (compiling on windows being the issue)? at least, this seems a viable path? httptools seems like the preferred parser anyway, aiohttp is using it (so master of python fdk at this moment too)

@denismakogon
Copy link
Copy Markdown
Member Author

so i've ventured down a new road with httptools package https://github.com/fnproject/fdk-python/tree/delayed-import-42 - it's kinda like we discussed a while ago, it uses a C (nodejs) request parser (which means cython) - package itself is super easy to use (way easier than h11), and with a few lines of cleaning up to handle the body and run the function instead of just writing a toy 'hello world' response, I think that it will work, need to add a few more callbacks - if you're interested then old sanic server https://github.com/hhstore/annotated-py-sanic/blob/master/sanic-0.1.9/sanic/server.py is a pretty good reference for usage of this package really, before asgi stuff was a thing. posting cuz i'm running out for a bit and don't want to go too far down this path if there's some obvious issue with it you know of, also, you're welcome to branch off and try yourself..

I've been this path as well, I had (and having it one more time ) running coroutines within this code. We eventually copying all pieces of code from sanic but within a smaller amount of code.


For now, there's only one problem. I/We copied a piece of sanic, which is ok (but i'm not proud of that). The only thing is async-http lib that i've made doesn't start within 3 seconds.

So, I'd try to make it lighter, have some basic ideas.

@rdallman
Copy link
Copy Markdown
Contributor

I've been this path as well, I had (and having it one more time ) running coroutines within this code. We eventually copying all pieces of code from sanic but within a smaller amount of code.

yeah. i wonder if maybe clamping to old versions of sanic before cruft would start faster (slightly different than async-http lib which is based on newest?)? it seems to be pretty complete solution in about 100 LOC for httptools + asyncio.Protocol (basically copied version of old sanic), and since there's not a lot of code there locking to an old version doesn't seem all that bad (less room for bugs), but the case for just vendoring it and/or calling it our own so that we can modify it and/or reduce out what we don't need is there too I guess (we don't need lots of stuff... pipelining, cookies, multipart). i'm not sure which is better there, obviously it's not ideal to not have an off the shelf solution in this case, but the code itself here is really straight forward and small (httptools and asyncio do all the hard stuff, we're just tying it together)

seems like if i get something thrown together we can test with coroutines then? async-http lib seems slightly different but could be a good option too. one difference is async-http version has flow control, but this is pretty easy to add too (should fix coroutine issue I think?). if we can get the start time down on async-http lib then let's try it -- kinda going at it both ways, tearing it down vs building from nothing, not sure which is better but either may work. keep us posted please...

@denismakogon
Copy link
Copy Markdown
Member Author

@rdallman newer PR uses 0.0.4 of async-http that is a lot thinner and starts within 2.5 seconds:

import time: self [us] | cumulative | imported package
import time:      5710 |       5710 | zipimport
import time:      2704 |       2704 | _frozen_importlib_external
import time:       191 |        191 |     _codecs
import time:      5452 |       5643 |   codecs
import time:      2727 |       2727 |   encodings.aliases
import time:     95218 |     103587 | encodings
import time:      1373 |       1373 | encodings.utf_8
import time:       183 |        183 | _signal
import time:      1035 |       1035 | encodings.latin_1
import time:       211 |        211 |     _abc
import time:      1516 |       1727 |   abc
import time:     90188 |      91914 | io
import time:        85 |         85 |       _stat
import time:      1473 |       1558 |     stat
import time:      1068 |       1068 |       genericpath
import time:     90641 |      91709 |     posixpath
import time:    101234 |     101234 |     _collections_abc
import time:      6100 |     200600 |   os
import time:      2775 |       2775 |   _sitebuiltins
import time:       647 |        647 |   sitecustomize
import time:       381 |        381 |   usercustomize
import time:     98664 |     303064 | site
import time:      2969 |       2969 |     types
import time:      2207 |       2207 |     _collections
import time:     96113 |     101288 |   enum
import time:       126 |        126 |     _sre
import time:      1595 |       1595 |       sre_constants
import time:    101115 |     102709 |     sre_parse
import time:     96496 |     199330 |   sre_compile
import time:       107 |        107 |     _functools
import time:       105 |        105 |         _operator
import time:      2769 |       2873 |       operator
import time:     91269 |      91269 |       keyword
import time:       697 |        697 |         _heapq
import time:      6535 |       7231 |       heapq
import time:       847 |        847 |       itertools
import time:      2007 |       2007 |       reprlib
import time:     98329 |     202552 |     collections
import time:    191860 |     394519 |   functools
import time:       984 |        984 |   _locale
import time:      3035 |       3035 |   copyreg
import time:      7141 |     706295 | re
import time:       462 |        462 |           collections.abc
import time:      1413 |       1413 |             concurrent
import time:       151 |        151 |                 time
import time:      1348 |       1348 |                       token
import time:     91723 |      93071 |                     tokenize
import time:      1297 |      94367 |                   linecache
import time:     93380 |     187746 |                 traceback
import time:    102268 |     102268 |                 warnings
import time:     89322 |      89322 |                   _weakrefset
import time:      9670 |      98992 |                 weakref
import time:       136 |        136 |                   _string
import time:      6377 |       6512 |                 string
import time:     94020 |      94020 |                 threading
import time:       154 |        154 |                 atexit
import time:     10019 |     499859 |               logging
import time:      5272 |     505131 |             concurrent.futures._base
import time:     94648 |     601190 |           concurrent.futures
import time:       791 |        791 |             _socket
import time:       448 |        448 |               math
import time:     88812 |      88812 |               select
import time:      8784 |      98043 |             selectors
import time:       755 |        755 |             errno
import time:    102346 |     201934 |           socket
import time:      2963 |       2963 |             signal
import time:     91037 |      91037 |             _posixsubprocess
import time:     96566 |     190565 |           subprocess
import time:      1697 |       1697 |             _ssl
import time:       364 |        364 |                 _struct
import time:       631 |        995 |               struct
import time:     93502 |      93502 |               binascii
import time:      3756 |      98252 |             base64
import time:    106763 |     206710 |           ssl
import time:      2020 |       2020 |           asyncio.constants
import time:       217 |        217 |                   _opcode
import time:     84364 |      84581 |                 opcode
import time:    103268 |     187848 |               dis
import time:      3849 |       3849 |                 importlib
import time:      1273 |       5121 |               importlib.machinery
import time:    105807 |     298775 |             inspect
import time:       867 |        867 |               asyncio.format_helpers
import time:      1261 |       2127 |             asyncio.base_futures
import time:       596 |        596 |             asyncio.log
import time:      4169 |     305665 |           asyncio.coroutines
import time:       279 |        279 |               _contextvars
import time:       500 |        778 |             contextvars
import time:      2018 |       2018 |               asyncio.base_tasks
import time:      1639 |       3657 |             _asyncio
import time:     91291 |      95725 |           asyncio.events
import time:     91512 |      91512 |           asyncio.futures
import time:      2668 |       2668 |           asyncio.protocols
import time:      1240 |       1240 |             asyncio.transports
import time:     91889 |      93129 |           asyncio.sslproto
import time:      9826 |       9826 |           asyncio.tasks
import time:     95158 |    1896559 |         asyncio.base_events
import time:     94196 |      94196 |         asyncio.locks
import time:      2049 |       2049 |         asyncio.runners
import time:      3715 |       3715 |         asyncio.queues
import time:     91571 |      91571 |         asyncio.streams
import time:      3980 |       3980 |         asyncio.subprocess
import time:      1875 |       1875 |           asyncio.base_subprocess
import time:     90590 |      90590 |           asyncio.selector_events
import time:      8476 |     100940 |         asyncio.unix_events
import time:      1668 |    2194676 |       asyncio
import time:     93203 |      93203 |       fdk.constants
import time:       275 |        275 |       fdk.customer_code
import time:       850 |        850 |       fdk.log
import time:       490 |        490 |       fdk.http
import time:       548 |        548 |       fdk.http.event_handler
import time:       457 |        457 |       async_http
import time:       657 |        657 |         async_http.exceptions
import time:       544 |        544 |               urllib
import time:    103613 |     104157 |             urllib.parse
import time:      1081 |       1081 |             async_http.cookie
import time:       318 |        318 |                   _json
import time:      1713 |       2030 |                 json.scanner
import time:      3945 |       5974 |               json.decoder
import time:      3212 |       3212 |               json.encoder
import time:     91940 |     101126 |             json
import time:       771 |     207133 |           async_http.response
import time:       381 |     207513 |         async_http.error_handler
import time:       394 |        394 |                 httptools.parser.errors
import time:       567 |        961 |               httptools.parser.parser
import time:       334 |       1294 |             httptools.parser
import time:       441 |       1735 |           httptools
import time:       858 |       2593 |         async_http.request
import time:       160 |        160 |             uvloop
import time:       688 |        847 |           async_http.protocol
import time:      1300 |       2146 |         async_http.server
import time:     90390 |     303298 |       async_http.app
import time:       938 |        938 |       async_http.router
import time:      1648 |    2596378 |     fdk
import time:       403 |    2596781 |   fdk.scripts
import time:      1022 |    2597802 | fdk.scripts.fdk
import time:     91829 |      91829 |   importlib.abc
import time:      5755 |       5755 |   contextlib
import time:      2167 |      99750 | importlib.util
2019-01-29 00:05:30,503 - asyncio - DEBUG - Using selector: EpollSelector

Comment thread fdk/customer_code.py

@staticmethod
def import_from_source(func_module_path):
from importlib import util
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trying to reduce imports time by delaying code initialization as close as possible to the time of the actual call.

Comment thread fdk/http/event_handler.py Outdated
logger.info("request execution completed")
return func_response

from async_http import response
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, import response only you need that, import are cachable, so, only 1st import is heavy, the rest are not.

@rdallman
Copy link
Copy Markdown
Contributor

cool. small bug:

$ cli invoke testapp test-func
b'{"message": "Hello World"}'

b'%s' printing somewhere

otherwise, testing it now...

@rdallman
Copy link
Copy Markdown
Contributor

mmm, still seeing a few:

{"message":"Container initialization timed out, please ensure you are using the latest fdk / format and check the logs"}                                     
                                                                                                                                                                                                     
Fn: Error calling function: status 504 

:(

with the from scratch edition (linked above) it's not having issues opening the socket on time, at least.

does sanic based lib have long init sequence we could minimize too? any other ideas?

@denismakogon
Copy link
Copy Markdown
Member Author

with the from scratch edition (linked above) it's not having issues opening the socket on time, at least.

that's literally due to the amount of code

does sanic based lib have long init sequence we could minimize too?

without having any logs from the container startup I can't tell you more. I'm not seeing a lot of code in front of asyncio.

the number of the dependencies were reduced to 1.

I need to know exactly when Fn aborts startup procedure and logs would be very useful.

@denismakogon
Copy link
Copy Markdown
Member Author

there's nothing more to reduce in both async-http or fdk in terms of startup time, 2.49 is the best time.

@rdallman
Copy link
Copy Markdown
Contributor

from what i understand wrt how python loads modules (read: unintelligibly), can we include the async-http lib as a subdirectory in fdk-python and import it from there, instead of from requirements.txt?

@denismakogon
Copy link
Copy Markdown
Member Author

Yes, I can do that.

@rdallman
Copy link
Copy Markdown
Contributor

@denismakogon any updates I can test? i think i'm probably 2-3 days out from a bottoms up one, but won't pursue that if we can get something else working meanwhile

@denismakogon
Copy link
Copy Markdown
Member Author

not yet, need some time, will work on it tomorrow.

@rdallman
Copy link
Copy Markdown
Contributor

rdallman commented Feb 4, 2019

can we include the async-http lib as a subdirectory in fdk-python and import it from there, instead of from requirements.txt?

confirmed that this approach seems to resolve the 504 from importing it in requirements.txt. need to fix this bug and probably test a little more extensively, but would like to close this out...

@denismakogon
Copy link
Copy Markdown
Member Author

need to fix this bug

that was fixed with c52b7bc

Copy link
Copy Markdown
Contributor

@rdallman rdallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - i tested a few things this morning; concurrency, latency, large payloads, connections. all looks pretty good, no 502 or 504. i can start a container on my laptop with this in < 800ms now, which is much better than 2500ms number that was thrown out - i think import of subdir is lots faster than loading external module. in any case, I haven't tested the minutia (headers, etc) but that should all be okay I hope... hopefully it's tested somewhere in here, or the users will debug for us - at this point, satisfied that 2 issues were taken care of and this will ease the onboarding process for python users who end up getting failing functions in their first go round.

i forget now whether we broke users that were using main or not and if we provided clear instructions to migrate (not just a blog, but readme). it would be nice to let them run main with a warning just to move things over, but it's nice to force migration too - it would be nice to have the migration process clearly outlined before merging here (maybe it's here and i missed it scanning through).

thanks for your persistence here @denismakogon and thanks others for your time in testing and reviewing this. i've a newfound loathing for this language and i think i've sworn it off for good now.

@denismakogon
Copy link
Copy Markdown
Member Author

i forget now whether we broke users that were using main or not and if we provided clear instructions to migrate (not just a blog, but readme)

code check, readme, docs, tutorials and blogpost - suppose to be enough to outline the migration process.

all looks pretty good, no 502 or 504. i can start a container on my laptop with this in < 800ms now, which is much better than 2500ms number that was thrown out - i think import of subdir is lots faster than loading external module.

i still don't like the idea of moving the whole async-http here, but i do see a need in it, unfortunately, there's no way to speed up the whole importing system for now.

thanks for your persistence here @denismakogon and thanks others for your time in testing and reviewing this. i've a newfound loathing for this language and i think i've sworn it off for good now.

thanks to all of you for being around and helping to shape it up to the final form.

@denismakogon
Copy link
Copy Markdown
Member Author

I haven't tested the minutia (headers, etc) but that should all be okay I hope... hopefully it's tested somewhere in here, or the users will debug for us - at this point, satisfied that 2 issues were taken care of and this will ease the onboarding process for python users who end up getting failing functions in their first go round.

all of that covered by unit tests.

@denismakogon denismakogon merged commit 3218f65 into master Feb 6, 2019
@denismakogon denismakogon deleted the delayed-import branch February 6, 2019 17:10
@rdallman
Copy link
Copy Markdown
Contributor

rdallman commented Feb 6, 2019

woooooo

giphy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants