Digest authentication throwing errors in Jena 4.3.X and newer #1318

Closed
ebremer opened this issue May 17, 2022 · 4 comments · Fixed by #1319

@ebremer
Contributor

ebremer commented May 17, 2022

Query query = QueryFactory.create("select ?s where {?s ?p ?o} limit 10");
AuthEnv.get().registerUsernamePassword(new URI("https://myserver.edu/sparql-auth"), Settings.user, Settings.password);
HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .version(HttpClient.Version.HTTP_1_1)
                .build();
RDFConnection con = RDFConnectionRemote.create()
                .destination(host)
                .queryEndpoint("sparql-auth")
                .httpClient(client)
                .build();
QueryExecution qe = con.query(query);
ResultSet results = qe.execSelect();

is throwing an error:

Exception in thread "main" java.lang.IllegalArgumentException: invalid header value: "Digest username="username", realm="SPARQL", nonce="b7ef0b701772c22b88db561ac58bd55f", uri="/sparql-auth?query=SELECT  ?s
WHERE
  { ?s  ?p  ?o }
LIMIT   10
", qop=auth, cnonce="D8DA17AFEAD93444", nc=00000000, response="4d8ba8cef8aaa7f22d1adf811c6ac903", opaque="5ebe2294ecd0e0f08eab7690d2a6ee69""
at java.net.http/jdk.internal.net.http.common.Utils.newIAE(Utils.java:286)
at java.net.http/jdk.internal.net.http.HttpRequestBuilderImpl.checkNameAndValue(HttpRequestBuilderImpl.java:113)
at java.net.http/jdk.internal.net.http.HttpRequestBuilderImpl.setHeader(HttpRequestBuilderImpl.java:119)
at java.net.http/jdk.internal.net.http.HttpRequestBuilderImpl.setHeader(HttpRequestBuilderImpl.java:43)
at org.apache.jena.http.auth.DigestLib.lambda$buildDigest$0(DigestLib.java:119)
at org.apache.jena.http.auth.AuthLib.handle401(AuthLib.java:124)
at org.apache.jena.http.auth.AuthLib.authExecute(AuthLib.java:54)
at org.apache.jena.http.HttpLib.execute(HttpLib.java:536)
at org.apache.jena.http.HttpLib.execute(HttpLib.java:493)
at org.apache.jena.sparql.exec.http.QueryExecHTTP.executeQuery(QueryExecHTTP.java:497)
at org.apache.jena.sparql.exec.http.QueryExecHTTP.performQuery(QueryExecHTTP.java:471)
at org.apache.jena.sparql.exec.http.QueryExecHTTP.execRowSet(QueryExecHTTP.java:168)
at org.apache.jena.sparql.exec.http.QueryExecHTTP.select(QueryExecHTTP.java:160)
at org.apache.jena.sparql.exec.QueryExecutionAdapter.execSelect(QueryExecutionAdapter.java:117)
at org.apache.jena.sparql.exec.QueryExecutionCompat.execSelect(QueryExecutionCompat.java:97)
at com.mycompany.rad.RDF.<init>(RDF.java:90)
at com.mycompany.rad.RDF.main(RDF.java:105)

If I change

QueryExecution qe = con.query(query);
to
QueryExecution qe = con.query("select ?s where {?s ?p ?o} limit 10");

it works.

My environment is:

java -version
openjdk version "17.0.3" 2022-04-19
OpenJDK Runtime Environment GraalVM CE 22.1.0 (build 17.0.3+7-jvmci-22.1-b06)
OpenJDK 64-Bit Server VM GraalVM CE 22.1.0 (build 17.0.3+7-jvmci-22.1-b06, mixed mode, sharing)

Perhaps the SPARQL string isn't being URL-encoded properly?

@rvesse
Member

rvesse commented May 17, 2022

So the HTTP authentication challenge is coming back from the SPARQL server you are interacting with, which you haven't specified.

I'm unclear whether the individual parameter fields within the WWW-Authenticate header are permitted to have whitespace or not.

Following the grammar, the auth-param rule says "bad whitespace" (BWS) may be present, per RFC 7230 Section 3.2.3:

auth-param = token BWS "=" BWS ( token / quoted-string )

The BWS rule is used where the grammar allows optional whitespace
only for historical reasons. A sender MUST NOT generate BWS in
messages. A recipient MUST parse for such bad whitespace and remove
it before interpreting the protocol element.

 OWS            = *( SP / HTAB )
                ; optional whitespace
 RWS            = 1*( SP / HTAB )
                ; required whitespace
 BWS            = OWS
                ; "bad" whitespace

The upshot is that servers MUST NOT produce whitespace here, BUT recipients (i.e. Jena) should strip and ignore such bad whitespace.

That whitespace rule seems to apply between parameters, though, not within a parameter value itself. The relevant grammar rule actually looks to be quoted-string (RFC 7230 Section 3.2.6), which says:

A string of text is parsed as a single value if it is quoted using
double-quote marks.

 quoted-string  = DQUOTE *( qdtext / quoted-pair ) DQUOTE
 qdtext         = HTAB / SP /%x21 / %x23-5B / %x5D-7E / obs-text
 obs-text       = %x80-FF

Comments can be included in some HTTP header fields by surrounding
the comment text with parentheses. Comments are only allowed in
fields containing "comment" as part of their field value definition.

 comment        = "(" *( ctext / quoted-pair / comment ) ")"
 ctext          = HTAB / SP / %x21-27 / %x2A-5B / %x5D-7E / obs-text

The backslash octet ("\") can be used as a single-octet quoting
mechanism within quoted-string and comment constructs. Recipients
that process the value of a quoted-string MUST handle a quoted-pair
as if it were replaced by the octet following the backslash.

Which boils down to new lines not being valid in these parameters AFAICT. So this looks like probably a server bug but maybe an area where Jena could be more forgiving?
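
To make the grammar above concrete, here is a minimal sketch of a checker for the text between the double quotes of a quoted-string. The class and method names are hypothetical and are not part of Jena or the JDK; the quoted-pair rule (`"\" ( HTAB / SP / VCHAR / obs-text )`) is taken from the same RFC 7230 section and is not quoted in this thread.

```java
class QuotedStringCheck {
    /** qdtext = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text */
    public static boolean isQdtext(char c) {
        return c == '\t' || c == ' ' || c == 0x21
                || (c >= 0x23 && c <= 0x5B)
                || (c >= 0x5D && c <= 0x7E)
                || (c >= 0x80 && c <= 0xFF);
    }

    /** Octets allowed after a backslash: HTAB / SP / VCHAR / obs-text */
    private static boolean isQuotablePairOctet(char c) {
        return c == '\t' || c == ' '
                || (c >= 0x21 && c <= 0x7E)
                || (c >= 0x80 && c <= 0xFF);
    }

    /** Checks the content between the DQUOTEs (the quotes themselves excluded). */
    public static boolean isValidQuotedContent(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                // A backslash must be followed by exactly one quotable octet.
                if (i + 1 >= s.length() || !isQuotablePairOctet(s.charAt(i + 1))) {
                    return false;
                }
                i++; // skip the quoted octet
            } else if (!isQdtext(c)) {
                return false; // CR, LF, a bare double quote, and other controls end up here
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidQuotedContent("/sparql-auth?query=SELECT%20%3Fs"));
        System.out.println(isValidQuotedContent("/sparql-auth?query=SELECT  ?s\nWHERE"));
    }
}
```

Under this reading of the grammar, a value containing a raw line feed is rejected, while its percent-encoded form passes.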

@afs
Member

afs commented May 17, 2022

The problem is in the outgoing credentials request after the 401 challenge.

It is the SPARQL string not being encoded and it has a newline in it; the JDK rejects the request locally. The server has only sent the 401 challenge which is legal and isn't involved in the exception cause at Utils.newIAE.
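
The local rejection can be reproduced without Jena or any server. The following is a minimal sketch using only `java.net.http`; the class name, helper name, and header values are made up for illustration, and no request is ever sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class HeaderNewlineDemo {
    /** Returns true if the JDK request builder accepts the value for an Authorization header. */
    public static boolean jdkAcceptsHeaderValue(String value) {
        try {
            // Only the builder is exercised; nothing goes over the network.
            HttpRequest.newBuilder(URI.create("http://localhost/sparql-auth"))
                    .setHeader("Authorization", value);
            return true;
        } catch (IllegalArgumentException ex) {
            // Same exception type as in the stack trace in this issue.
            return false;
        }
    }

    public static void main(String[] args) {
        String encoded = "Digest uri=\"/sparql-auth?query=SELECT%20%3Fs\"";
        String raw = "Digest uri=\"/sparql-auth?query=SELECT  ?s\nWHERE\"";
        System.out.println("encoded accepted: " + jdkAcceptsHeaderValue(encoded));
        System.out.println("raw accepted:     " + jdkAcceptsHeaderValue(raw));
    }
}
```

The value with the raw newline fails in `setHeader`, before any credentials reach the server.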

The "uri" auth param in a digest is the "request-target". That includes the query string, by my reading of RFC 7616, which refers to RFC 7230 Section 5.5.

But that does not make sense in our situation:

  • "?query=" and POST+application/sparql-query should be the same authentication. It is not like /path?client=1234 where the query string is a subresource.
  • The auth support is by URI path and realm, not query string. The users/password applies to the endpoint.

So for the SPARQL case, a request-target of the URI path makes sense to me (no query). This works with Jetty. This is not surprising because the Jetty resource is named by the URI without the query string.

Putting in the encoded query also works with Jetty: in HttpLib.requestTarget, use getRawQuery, not getQuery.
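
The difference between the two `java.net.URI` accessors, and the path-only alternative, can be sketched as follows. The helper names here are hypothetical and this is not the code of Jena's `HttpLib.requestTarget`; it only illustrates what each choice would put into the digest "uri" parameter.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class RequestTargetDemo {
    /** Builds an endpoint URI carrying the SPARQL string as an encoded ?query= parameter. */
    public static URI endpointWithQuery(String endpoint, String sparql) {
        return URI.create(endpoint + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8));
    }

    /** Path plus DECODED query: escaped octets such as %0A become raw characters again. */
    public static String targetDecoded(URI uri) {
        String q = uri.getQuery();
        return q == null ? uri.getRawPath() : uri.getRawPath() + "?" + q;
    }

    /** Path plus RAW query: percent-escapes are kept exactly as they go on the wire. */
    public static String targetRaw(URI uri) {
        String q = uri.getRawQuery();
        return q == null ? uri.getRawPath() : uri.getRawPath() + "?" + q;
    }

    /** Path only: the same value whether the query is sent by GET or POST. */
    public static String targetPathOnly(URI uri) {
        return uri.getRawPath();
    }

    public static void main(String[] args) {
        URI uri = endpointWithQuery("https://myserver.edu/sparql-auth",
                "SELECT ?s\nWHERE { ?s ?p ?o }");
        System.out.println("decoded has newline: " + targetDecoded(uri).contains("\n"));
        System.out.println("raw:       " + targetRaw(uri));
        System.out.println("path only: " + targetPathOnly(uri));
    }
}
```

Only the decoded form reintroduces the newline that the JDK rejects in a header value; the raw and path-only forms are both safe to place in a header.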

@afs afs self-assigned this May 17, 2022
@afs
Member

afs commented May 17, 2022

Extra oddity: as query strings for SPARQL can be long, there is a chance the HTTP request will exceed the server limit on request headers, because the URI will be in the headers twice (request line and "Authorization" header). The total limit is often 8K but some nginx setups use 4K.
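
As a rough illustration of that concern, the sketch below estimates how many header bytes the request-target alone would take when it appears twice. The class name, method names, and the choice of limit are assumptions for illustration; a real request also carries other headers and the rest of the digest parameters, so this is only a lower bound.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class HeaderSizeEstimate {
    /** Bytes used by the request-target when it appears in the request line and in the digest "uri". */
    public static int uriBytesInHeaders(String path, String sparql) {
        String target = path + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        int once = target.getBytes(StandardCharsets.US_ASCII).length;
        return 2 * once;
    }

    /** True if the doubled request-target alone already exceeds the given header limit. */
    public static boolean exceedsLimit(String path, String sparql, int limitBytes) {
        return uriBytesInHeaders(path, sparql) > limitBytes;
    }

    public static void main(String[] args) {
        String sparql = "select ?s where {?s ?p ?o} limit 10";
        System.out.println(uriBytesInHeaders("/sparql-auth", sparql) + " bytes");
        System.out.println("over 8K: " + exceedsLimit("/sparql-auth", sparql, 8 * 1024));
    }
}
```

Using only the URI path as the digest "uri" would avoid the doubling.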

@ebremer
Contributor Author

ebremer commented May 17, 2022

FYI - The server I am executing this against is Virtuoso.
