Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async Service Brokers can return unrestricted operation data in response body #695

Closed
avade opened this issue Oct 10, 2016 · 13 comments
Closed
Labels

Comments

@avade
Copy link

avade commented Oct 10, 2016

Issue

In the Service Broker API there only a database size restriction on the new "operation" string returned on the body of the provisioning and update. This could cause issues with the CC DB filling up if the service broker returns large amounts of data.

Context

Looking at the PR, I quickly searched for size restriction on the operation field, and could only see potential limitation on the cc_db. The schema migration seems to declare the field as a TEXT column (from Sequel datatype). On mysql, the TEXT seems a variable length datatype according to 1 and 2. On postgresql, it is also a variable unlimited length.

Steps to Reproduce

  1. Create an async service broker
  2. On async provision, requests return in the body operation field with a 10mb payload.
  3. See the 10mb payload ends up in the CC DB
  4. Repeat until CC DB is full...

Expected result

CC DB could run out of storage.

Current result

Possible Fix

Limit the payload size for the response body to ~10kb?

name of issue screenshot

?

@cf-gitbot
Copy link

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/131993979

The labels on this github issue will be updated when the story is started.

@gberche-orange
Copy link
Contributor

@avade It would be interesting to double check the handling of other service broker returned data is enforcing a specific size besides the DB schema enforcement:

  • the /catalog endpoint accepts
    • meta-data (I guess mapping to the extra field in the cc-db, currently mapped as a TEXT type)
    • tags stored as TEXT
  • the binding endpoint returns credentials also stored as TEXT type

Large db content injections may also come from the user provided content through the CLI, such as the service tags. However in this specific example, the CC seems to properly control this size: 666eda0

@mikexuu
Copy link
Contributor

mikexuu commented Nov 1, 2016

Hi @avade,

We think it's reasonable to apply a validation on a broker provided operation field. We're hesitant to limit this field too much in case it would break an existing service broker. We think the next course of action is to poll the CF-dev community and ask whether there are any objections to setting a limit of [to be decided with input from @SocalNick. 10KB?].

@michaelxupiv & @utako, CF CAPI Team

@avade
Copy link
Author

avade commented Nov 1, 2016

Sounds like a good next step.

@SocalNick - do you want to poll the community?

On Tue, Nov 1, 2016 at 10:57 PM, Michael Xu notifications@github.com
wrote:

Hi @avade https://github.com/avade,

We think it's reasonable to apply a validation on a broker provided
operation field. We're hesitant to limit this field too much in case it
would break an existing service broker. We think the next course of action
is to poll the CF-dev community and ask whether there are any objections to
setting a limit of [to be decided with input from @SocalNick
https://github.com/SocalNick. 10KB?].

@michaelxupiv https://github.com/michaelxupiv & @utako
https://github.com/utako, CF CAPI Team


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#695 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAeB19oyI4K8vdeIr2xIvO0DJDwXu-umks5q58PUgaJpZM4KSp6V
.

@SocalNick
Copy link
Contributor

This is a very new feature. I'm not sure that polling the community is necessary, shouldn't we just fix it?

@utako
Copy link
Contributor

utako commented Nov 2, 2016

@SocalNick: My concern is that there are already brokers that return operations bigger than 10KB, but I'm fine with this as a new feature if you feel like that's unnecessary.

@SocalNick
Copy link
Contributor

I'd like to fix this bug by reducing the size of broker_provided_operation field in the service_instance_operations table to 10k

luan added a commit that referenced this issue Dec 7, 2016
[GH #695]
[Finishes #131993979]

Signed-off-by: Eric Promislow <eric.promislow@hpe.com>
@SocalNick
Copy link
Contributor

This feels like it's bad because we have a service, but we've lost the provision response. This seems like an undesirable state. Would it be better to truncate the response?

cf cs fake-service-c5f81bae-9657-47a9-8bb9-3d4d3c5a07d0 fake-async-plan nope             1208ms
Creating service instance nope in org system / space system as admin...
FAILED
Server error, status code: 400, error code: 60003, message: The service instance is invalid: broker_provided_operation max_length

cf s                                                                                 1341ms
Getting services in org system / space system as admin...
OK

name   service                                             plan              bound apps   last operation
nope   fake-service-c5f81bae-9657-47a9-8bb9-3d4d3c5a07d0   fake-async-plan

cf curl /v2/service_instances                                                             733ms
{
   "total_results": 1,
   "total_pages": 1,
   "prev_url": null,
   "next_url": null,
   "resources": [
      {
         "metadata": {
            "guid": "3cac0c98-28f6-446a-9a1d-552660213477",
            "url": "/v2/service_instances/3cac0c98-28f6-446a-9a1d-552660213477",
            "created_at": "2016-12-08T22:47:10Z",
            "updated_at": "2016-12-08T22:47:10Z"
         },
         "entity": {
            "name": "nope",
            "credentials": {},
            "service_plan_guid": "0855c61a-ef3a-4984-846f-b070a67cee76",
            "space_guid": "d2357fe4-a9f2-4c76-99d9-321b7668fde4",
            "gateway_data": null,
            "dashboard_url": null,
            "type": "managed_service_instance",
            "last_operation": null,
            "tags": [],
            "service_guid": "c9b03c64-5ef7-47ad-811f-f5502b69dbb1",
            "space_url": "/v2/spaces/d2357fe4-a9f2-4c76-99d9-321b7668fde4",
            "service_plan_url": "/v2/service_plans/0855c61a-ef3a-4984-846f-b070a67cee76",
            "service_bindings_url": "/v2/service_instances/3cac0c98-28f6-446a-9a1d-552660213477/service_bindings",
            "service_keys_url": "/v2/service_instances/3cac0c98-28f6-446a-9a1d-552660213477/service_keys",
            "routes_url": "/v2/service_instances/3cac0c98-28f6-446a-9a1d-552660213477/routes",
            "service_url": "/v2/services/c9b03c64-5ef7-47ad-811f-f5502b69dbb1"
         }
      }
   ]
}

@Gerg
Copy link
Member

Gerg commented Dec 10, 2016

I investigated and the current implementation (raising an exception if > 10k characters) results in the following:

  1. User presented with error from Nick's comment above
  2. Delete request is issued to broker for orphan mitigation
  3. Service instance in CC is stranded without a last operation or backing service instance (reverse orphan)

It's interesting to note that we could get in this same state if the database deadlocked while persisting the service instance operation.


I'm not convinced that truncation is the correct strategy here. This field is meant to allow brokers to persist data in the CC, so that they don't have to maintain their own data store. Broker authors will likely serialize their data into this field, and truncating serialized JSON etc will definitely break brokers. This could cause service instances to get stuck in "In Progress" state while their brokers throw 4XX/5XXs due to the garbled data coming from CC (until the operation times out 1 week later).


If we want to keep something like the current implementation (fail fast), maybe the validation should live in response parser instead of the model, so it is more in line with the other broker response validations.

What do you think @avade?

@avade
Copy link
Author

avade commented Dec 12, 2016

I will defer this question to @jamesjoshuahill, way on the engineering side 👍

zrob added a commit that referenced this issue Dec 12, 2016
[GH #695]
[#131993979]

Signed-off-by: Greg Cobb <gcobb@pivotal.io>
Signed-off-by: Zach Robinson <zrobinson@pivotal.io>
@jamesjoshuahill
Copy link

Hi! 10k characters feels like a lot. For example, a typical on-demand service broker response would be: {"bosh_task_id":123456,"operation_type":"create"}. I think the column could be reduced from text to string, i.e. varchar(255).

I agree with @Gerg that truncation would be unhelpful. I suggest rejecting a response with operation data that exceeds the limit.

@avade
Copy link
Author

avade commented Dec 13, 2016

I think 255 is too small, given we don't know how large other brokers response is.

@SocalNick are you suggesting 10k characters or 10kb? 10kb would be 2560 characters if all chars were 4 bytes.

@SocalNick
Copy link
Contributor

@avade we did 10k characters. @zrob and @Gerg made two decisions / changes:

  1. For non-async, ignore the operation, it won't be used anyway
  2. If we receive more than 10k characters for the operation, it's considered a bad response and we enter our existing orphan mitigation strategy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants