Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add http/user_agent #3387

Merged
merged 3 commits into from
Jul 4, 2023
Merged

feat: add http/user_agent #3387

merged 3 commits into from
Jul 4, 2023

Conversation

kitsonk
Copy link
Contributor

@kitsonk kitsonk commented May 17, 2023

This feature adds user agent parsing to std/http. It is heavily inspired by ua-parser-js and is able to break down a user agent string into browser, OS, CPU, device, and browser engine.

For example to respond with a request with some information extracted from the user agent:

import { UserAgent } from "https://deno.land/std@$STD_VERSION/http/user_agent.ts";
import { serve } from "https://deno.land/std@$STD_VERSION/http/server.ts";

serve((req) => {
  const userAgent = new UserAgent(req.headers.get("user-agent") ?? "");
  return new Response(`Hello, ${userAgent.browser.name}
    on ${userAgent.os.name} ${userAgent.os.version}!`);
});

@kitsonk kitsonk requested a review from kt3k as a code owner May 17, 2023 11:57
@timreichen
Copy link
Contributor

I was wondering if a class based approach is really necessary or if it should be done via a parse (and stringify) function instead, since there is no internal state and just getters and no setters.

@kitsonk
Copy link
Contributor Author

kitsonk commented May 17, 2023

😕 There is internal state. Each property is lazily calculated and stored in the instance.

In addition, there are special toString() and toJSON() methods as well as custom inspection methods to encapsulate the concept of "user agent".

Also the class based approach aligns to other "http" things like Headers, URL, URLSearchParams, Request, Response, etc.

@lionel-rowe
Copy link
Contributor

There's also the option of using a module-private class as the implementation but only exporting a wrapper function that returns an instance of the class:

class UserAgent {
  // ...
}

export function parse(ua: string | null) {
  return new UserAgent(ua);
}

Has the advantage of slightly more flexibility:

[ua1, ua2, ua3].map(parse); // works
[ua1, ua2, ua3].map(UserAgent); // Class constructor UserAgent cannot be invoked without 'new'

Also the class based approach aligns to other "http" things like Headers, URL, URLSearchParams, Request, Response, etc.

All of those are part of the web platform, though, so they're not necessarily comparable to stuff in the std libary.

@kitsonk
Copy link
Contributor Author

kitsonk commented May 19, 2023

There's also the option of using a module-private class as the implementation but only exporting a wrapper function that returns an instance of the class:

I had originally written it that way, but it was a non-sensical, because it was arbitrary abstraction with no value, and it is technically a lie, because parse() doesn't parse.

Has the advantage of slightly more flexibility:

[ua1, ua2, ua3].map(parse); // works
[ua1, ua2, ua3].map(UserAgent); // Class constructor UserAgent cannot be invoked without 'new'

Very very trivially...

[ua1, ua2, ua3].map((ua) => new UserAgent(ua));

If there was more to create an instance than call the constructor, I would be inclined to add it.

Also the class based approach aligns to other "http" things like Headers, URL, URLSearchParams, Request, Response, etc.

All of those are part of the web platform, though, so they're not necessarily comparable to stuff in the std libary.

0k, then CookieMap, SecureCookieMap, and ServerSentEventStreamTarget then which are all std/http.

@timreichen
Copy link
Contributor

All of those are part of the web platform, though, so they're not necessarily comparable to stuff in the std libary.

0k, then CookieMap, SecureCookieMap, and ServerSentEventStreamTarget then which are all std/http.

I think the difference is that all other classes in http have an internal state that is changeable via property or methods, or extend an existing class.
The UserAgent class is different in that there are only private properties and getters.
Methods like toString, toJSON and inspect symbols are only there because it is a class in the first place.

What is the added value for UserAgent being a class instead of a plain object with lazy load getters?

function parse(ua: string) {
  let browser: Browser
  let cpu: Cpu
  let device: Device
  let engine: Engine
  let os: Os

  return {
    get browser(): Browser {
      if (!browser) {
        browser = {
          name: undefined,
          version: undefined,
          major: undefined,
        };
        mapper(browser, ua, matchers.browser);
        // deno-lint-ignore no-explicit-any
        (browser as any).major = majorize(browser.version);
        Object.freeze(browser);
      }
      return browser;
    },

    /** The architecture of the CPU extracted from the user agent string. */
    get cpu(): Cpu {
      if (!cpu) {
        cpu = { architecture: undefined };
        mapper(cpu, ua, matchers.cpu);
        Object.freeze(cpu);
      }
      return cpu;
    },

    /** The model, type, and vendor of a device if present in a user agent
     * string. */
    get device(): Device {
      if (!device) {
        device = { model: undefined, type: undefined, vendor: undefined };
        mapper(device, ua, matchers.device);
        Object.freeze(device);
      }
      return device;
    },

    /** The name and version of the browser engine in a user agent string. */
    get engine(): Engine {
      if (!engine) {
        engine = { name: undefined, version: undefined };
        mapper(engine, ua, matchers.engine);
        Object.freeze(engine);
      }
      return engine;
    },

    /** The name and version of the operating system in a user agent string. */
    get os(): Os {
      if (!os) {
        os = { name: undefined, version: undefined };
        mapper(os, ua, matchers.os);
        Object.freeze(os);
      }
      return os;
    },
  };

}

Null input

Why is the input be allowed to be null? Seems weird to me, since it replaces it with and empty string. Imo it should only allow string and everything else should be done outside the module by the user.

@kitsonk
Copy link
Contributor Author

kitsonk commented May 19, 2023

Methods like toString, toJSON and inspect symbols are only there because it is a class in the first place.

No, getter do not log. So a plain object with getters does not log to the console well.

What is the added value for UserAgent being a class instead of a plain object with lazy load getters?

What is the added value of it being a plan object with lazy load getters? Actually lazy load getters becomes more complex because you have to store the parsed value somewhere associated with the instance. You don't want to reparse on every property access. 😦

Why is the input be allowed to be null?

const ua = new UserAgent(request.headers.get("user-agent"));

@lino-levan
Copy link
Contributor

I guess there's some meta-commentary to be had around "should we even encourage user agent sniffing by adding this module to std?"

The actual code and interface makes sense to me, just that lingering issue I guess.

@kitsonk
Copy link
Contributor Author

kitsonk commented May 26, 2023

I guess there's some meta-commentary to be had around "should we even encourage user agent sniffing by adding this module to std?"

It would be incorrect to assume that the only purpose for parsing the user agent is to do "sniffing" instead of client side feature detection. There are a lot of "legitimate" reasons to parse the user agent, like reporting and analytics, improving user experience by defaulting a download to a specific platform/os build, understanding if the client is an embedded system/tablet/mobile device irrespective of device geometry, etc.

@timreichen
Copy link
Contributor

Why is the input be allowed to be null?

const ua = new UserAgent(request.headers.get("user-agent"));

I get that this makes it a bit more convenient in your example, but passing null to UserAgent makes no sense imo.

const ua = new UserAgent(null) // this is valid... 

Instead the user could simply check if user-agent is present and an instance should be created in the first place or pass an empty string by themselves.

const ua = request.headers.has("user-agent") ? new UserAgent(request.headers.get("user-agent")) : null;

or

const ua = new UserAgent(request.headers.get("user-agent") ?? "") 

This seems cleaner and more logical imo.

@kitsonk
Copy link
Contributor Author

kitsonk commented May 27, 2023

I get that this makes it a bit more convenient in your example, but passing null to UserAgent makes no sense imo.

Well it does imo. We will have to agree to disagree. ?? "" is non-sensical too. The API should cater to the most common use case, which would be request.headers.get("user-agent") and not make the user go through additional hoops imo.

http/user_agent.ts Outdated Show resolved Hide resolved
@lionel-rowe
Copy link
Contributor

We will have to agree to disagree. ?? "" is non-sensical too

That seems like an argument for making it more strict, not less — maybe it should throw on empty-string inputs. I don't see the use in deliberately allowing an API to represent nonsensical states — that's just inviting API consumers to write buggy code.

The API should cater to the most common use case, which would be request.headers.get("user-agent")

Even aside from the fact there are other plausible use cases (client-side parsing of navigator.userAgent, parsing analytics data from logs, etc), plus the fact that back-end frameworks often have their own ways of dealing with headers that might return other values besides string | null, I still don't really follow this logic. Should a mime-type parsing library also accept null inputs because it's frequently used with Accept/Content-Type? Or a JWT library should accept nulls because JWTs are most often used with Authorization headers?

@kitsonk
Copy link
Contributor Author

kitsonk commented May 28, 2023

That seems like an argument for making it more strict, not less — maybe it should throw on empty-string inputs. I don't see the use in deliberately allowing an API to represent nonsensical states — that's just inviting API consumers to write buggy code.

It throwing would invalidate its most common use case... When processing a request and creating a "context" associated with that request, UserAgent should fail friendly, not throw. So if the request does not contain a User-Agent header, it will pass a null, and the UserAgent instance will properly present that individual properties are undefined.

Should a mime-type parsing library also accept null inputs because it's frequently used with Accept/Content-Type?

Yes.

@lino-levan
Copy link
Contributor

Just wanted to voice my opinion here. I have to agree with @kitsonk on what the functionality should be. Seems like it is a lot more practical for the average use case.

Copy link
Member

@kt3k kt3k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in review. LGTM

#3387 (comment)

There are a lot of "legitimate" reasons to parse the user agent, like reporting and analytics, improving user experience by defaulting a download to a specific platform/os build, understanding if the client is an embedded system/tablet/mobile device irrespective of device geometry, etc.

I think this comment describes the role/purpose of this module well, and I'm in favor of landing this.

@kt3k kt3k merged commit 7cdd55a into denoland:main Jul 4, 2023
8 checks passed
@kitsonk kitsonk deleted the feat_user_agent branch July 6, 2023 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants