Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EMR Service does not have a usable "Resource" #1189

Closed
bmhatfield opened this issue May 20, 2016 · 4 comments
Closed

EMR Service does not have a usable "Resource" #1189

bmhatfield opened this issue May 20, 2016 · 4 comments
Labels
feature-request A feature should be added or improved. service-api General API label for AWS Services.

Comments

@bmhatfield
Copy link

Hi,

We're currently working on switching a service we have using aws-sdk-v1 to aws-sdk (v2).

One thing we quickly noticed is that the EMR client in v2 does not have a OO / expressive Resource type, forcing us to use the Client type. This experience is much worse than v1. For example, trying to determine if a job flow existed in v1 allowed us to do something like:

flow = emr.job_flows[id] and then flow.exists?. In v2, we must do emr.describe_cluster({:cluster_id => id}), which will in the "doesn't exist" case will then explode with a generic exception class Aws::EMR::Errors::InvalidRequestException with the string Cluster id 'j-[IDNUMBERHERE]' is not valid. Since InvalidRequestException could mean many things, we're must then either contextually assume our flow doesn't exist, which seems like a bad idea, or string match on the exception string, which is even worse.

Also, the client documentation for EMR in aws-sdk V2 is confusing or misleading. For example if we do run_job_flow, that will return a RunJobFlowOutput object, which only has one method, job_flow_id, which then returns a string. describe_cluster however is documented as taking a cluster_id, which while we were able to eventually figure out that they are the same thing, nothing in the documentation indicated as such.

@awood45
Copy link
Member

awood45 commented May 20, 2016

I think part of this issue comes from the fact that the EMR API has changed as well - #describe_job_flows is now deprecated with intent to be removed: doc link. A straight port of #exists? from V1 wouldn't work in the same way because we're talking about different client methods.

@awood45 awood45 added feature-request A feature should be added or improved. Version 2 service-api General API label for AWS Services. labels May 20, 2016
@awood45
Copy link
Member

awood45 commented May 20, 2016

Let's work through the current state, especially since there is a forced migration from #describe_job_flows to #describe_cluster - so even a future higher level abstraction needs to handle this behavior. Let's say I make a describe cluster request for a cluster that obviously does not exist:

emr = Aws::EMR::Client.new(http_wire_trace: true)
emr.describe_cluster(cluster_id: "j-1foobar")

The body of the 400 Bad Request response I get from the service (if pretty-printed) is:

{
  "__type": "InvalidRequestException",
  "ErrorCode": "MalformedClusterId",
  "Message": "Cluster id 'j-1foobar' is not valid."
}

I want to emphasize, again, that V1 of the SDK is going to have this same issue as #describe_job_flows gets deprecated (I can't even run it on my test account anymore to compare). We're building the exception type first from the __type value, and second from the ErrorCode value. This is the cross-service standard. We calculate the code in the same way. Unfortunately in this case, it clobbers the more descriptive ErrorCode value in the error response body, which is difficult to fix in a non-breaking way. In other words, even a higher level abstraction would probably have to resolve the situation in the same way you did, at least for the moment. Though something in the exception class that exposes the raw ErrorCode from the response seems like it could be helpful in cases like these without being a breaking change.

@awood45
Copy link
Member

awood45 commented May 20, 2016

Messing around, you can actually expose this error. It's not the cleanest code (and not tested in detail), but while we figure out the best way to expose this, it illustrates how you could more reliably pull out the error code from the response body:

def exists?(cluster_id)
  begin
    emr.describe_cluster(cluster_id: cluster_id)
    true
  rescue Aws::EMR::Errors::InvalidRequestException => e
    if JSON.parse(e.context.http_response.body_contents)["ErrorCode"] == "MalformedClusterId"
      false
    else
      raise e
    end
  end
end

@awood45
Copy link
Member

awood45 commented Jun 3, 2016

Added to feature request backlog.

@awood45 awood45 closed this as completed Jun 3, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved. service-api General API label for AWS Services.
Projects
None yet
Development

No branches or pull requests

2 participants