Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with performance and iterating over tar policy having large number of files #241

Open
prashil-g opened this issue Jun 5, 2022 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@prashil-g
Copy link

prashil-g commented Jun 5, 2022

Describe the bug
Seeing timeout and crash of worker threads when policy tar containing large numeber of policy is pushed

To Reproduce
running the api source docker compose reference provided in the permitio github repo and sending policy bundle attached
bundle.tar.gz


[10:54](https://permit-io.slack.com/archives/C03BJREJGBU/p1651166648079409)
version: "3.8"
services:
  # When scaling the opal-server to multiple nodes and/or multiple workers, we use
  # a *broadcast* channel to sync between all the instances of opal-server.
  # Under the hood, this channel is implemented by encode/broadcaster (see link below).
  # At the moment, the broadcast channel can be either: postgresdb, redis or kafka.
  # The format of the broadcaster URI string (the one we pass to opal server as `OPAL_BROADCAST_URI`) is specified here:
  # https://github.com/encode/broadcaster#available-backends
  broadcast_channel:
    image: postgres:alpine
    environment:
      - POSTGRES_DB=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres

  opal_server:
    # by default we run opal-server from latest official image
    image: permitio/opal-server:latest
    environment:
      # the broadcast backbone uri used by opal server workers (see comments above for: broadcast_channel)
      - OPAL_BROADCAST_URI=postgres://postgres:postgres@broadcast_channel:5432/postgres
      # number of uvicorn workers to run inside the opal-server container
      - UVICORN_NUM_WORKERS=4
      # the url of the Api bundle server hosting our policy
      # - you can pass a token if you need to authentication via `POLICY_BUNDLE_SERVER_TOKEN`
      # - in this example we use nginx server that serve static bundle.tar.gz files without token
      # - our bundle server is compatible with OPA bundle server
      # - for more info, see: hthttps://www.openpolicyagent.org/docs/latest/management-bundles/
      - OPAL_POLICY_BUNDLE_URL=http://api_policy_source_server/
      - OPAL_POLICY_SOURCE_TYPE=API
      # - the base path for the local git in Opal server
      - OPAL_POLICY_REPO_CLONE_PATH=~/opal
      # in this example we will use a polling interval of 30 seconds to check for new policy updates (new bundle files).
      # however, it is better to utilize a api *webhook* to trigger the server to check for changes only when the bundle server has new bundle.
      # for more info see: https://github.com/permitio/opal/blob/master/docs/HOWTO/track_an_api_bundle_server.md
      - OPAL_POLICY_REPO_POLLING_INTERVAL=30
      # configures from where the opal client should initially fetch data (when it first goes up, after disconnection, etc).
      # the data sources represents from where the opal clients should get a "complete picture" of the data they need.
      # after the initial sources are fetched, the client will subscribe only to update notifications sent by the server.
      - OPAL_DATA_CONFIG_SOURCES={"config":{"entries":[{"url":"http://host.docker.internal:7002/policy-data","topics":["policy_data"]}]}}
      - OPAL_LOG_FORMAT_INCLUDE_PID=true
    ports:
      # exposes opal server on the host machine, you can access the server at: http://localhost:7002/
      - "7002:7002"
    depends_on:
      - broadcast_channel

  opal_client:
    # by default we run opal-client from latest official image
    image: permitio/opal-client:latest
    environment:
      - OPAL_SERVER_URL=http://opal_server:7002/
      - OPAL_LOG_FORMAT_INCLUDE_PID=true
      - OPAL_INLINE_OPA_LOG_FORMAT=http
    ports:
      # exposes opal client on the host machine, you can access the client at: http://localhost:7000/
      - "7000:7000"
      # exposes the OPA agent (being run by OPAL) on the host machine
      # you can access the OPA api that you know and love at: http://localhost:8181/
      # OPA api docs are at: https://www.openpolicyagent.org/docs/latest/rest-api/
      - "8181:8181"
    depends_on:
      - opal_server
    # this command is not necessary when deploying OPAL for real, it is simply a trick for dev environments
    # to make sure that opal-server is already up before starting the client.
    command: sh -c "./wait-for.sh opal_server:7002 --timeout=20 -- ./start.sh"

  # Demo bundle server to serve the policy
  api_policy_source_server:
    # we use nginx to serve the bundle files
    image: nginx
    # expose internal port 80 to localhost 8000
    ports:
      - 8000:80
    # map files into the docker to edit nginx conf and put the bundle files into the container
    volumes:
      - ./docker_files/bundle_files:/usr/share/nginx/html
      - ./docker_files/nginx.conf:/etc/nginx/nginx.conf

image

Expected behavior
Tar ball containing large number of policy rego should not take more time to untar

Screenshots
image

OPAL version

Issue was identified with tarsafe wrapper. If tarsafe wrapper isnt used, untarring the tar with large number of files is very fast and no worker crash seen
image

@prashil-g prashil-g added the bug Something isn't working label Jun 5, 2022
@orweis
Copy link
Contributor

orweis commented Jun 5, 2022

FYI: @obsd @roekatz @asafc

@prashil-g
Copy link
Author

Hi @orweis Did you find any fix for this?

@obsd
Copy link
Contributor

obsd commented Aug 2, 2022

Hi @prashil-g, thanks for the detailed bug description.
Do you consider the tar file to come from a safe source (no user input on file names etc..)?
I'm investigating it now and it will help me to come up with more possible solutions.

@prashil-g
Copy link
Author

We do trust the source. But that might not be the case for most users

@obsd
Copy link
Contributor

obsd commented Aug 11, 2022

@prashil-g sorry for a long time it took me
I've pushed a suggestion that may help you, I will consult with @orweis and @asafc about it
meanwhile, I would love to hear your thoughts as well
#284
Note that this is a draft PR

@obsd obsd self-assigned this Aug 14, 2022
@obsd obsd linked a pull request Aug 14, 2022 that will close this issue
@gemanor
Copy link
Collaborator

gemanor commented Sep 30, 2024

@orweis have we got more requirements like this after the PR or we can close the issue?

@orweis
Copy link
Contributor

orweis commented Sep 30, 2024

Not that I'm aware of

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants