Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace url parse by url class #1147

Merged
merged 1 commit into from
Oct 6, 2020

Conversation

seriousme
Copy link
Contributor

This fixes #1130 by replacing url.parse by the URL class as advised by the Node documentation on the subject:

Use of the legacy url.parse() method is discouraged. Users should use the WHATWG URL API. Because the url.parse() method 
uses a lenient, non-standard algorithm for parsing URL strings, security issues can be introduced. Specifically, issues with host 
name spoofing and incorrect handling of usernames and passwords have been identified.

Kind regards,
Hans

@seriousme
Copy link
Contributor Author

Ps. the failure on 10.x is due to some issue with the websocket implementation and has imho no relation to this PR.
(master branch also has this failure if I run it through CI)

@seriousme seriousme force-pushed the replace-url-parse-by-url-class branch from afc3930 to ac98686 Compare September 3, 2020 11:02
@seriousme
Copy link
Contributor Author

I have rebased to current master and resolved the merge conflict. However master still fails on node v10.

@seriousme
Copy link
Contributor Author

@YoDaMa can you please review ?
I'd rather not have to rebase again because my PR fell off the desk ;-)

@qm3ster
Copy link
Contributor

qm3ster commented Sep 29, 2020

What did you mean by

the URL object is a bit special

can we use spread syntax? Or do we actually have to use destructuring?

@YoDaMa
Copy link
Contributor

YoDaMa commented Sep 29, 2020

@seriousme we need to do a rollback... so you'll have to rebase once more :) also there's a conflict in ws.js... one moment.

@seriousme
Copy link
Contributor Author

What did you mean by

the URL object is a bit special

can we use spread syntax? Or do we actually have to use destructuring?

You can just do an Object.assign on the URL object,e.g.:

> u=new URL("https://www.example.com:8080/")
URL {
  href: 'https://www.example.com:8080/',
  origin: 'https://www.example.com:8080',
  protocol: 'https:',
  username: '',
  password: '',
  host: 'www.example.com:8080',
  hostname: 'www.example.com',
  port: '8080',
  pathname: '/',
  search: '',
  searchParams: URLSearchParams {},
  hash: '' }


> a= Object.assign({},u)
{ [Symbol(context)]:
   URLContext {
     flags: 400,
     scheme: 'https:',
     username: '',
     password: '',
     host: 'www.example.com',
     port: 8080,
     path: [ '' ],
     query: null,
     fragment: null },
  [Symbol(query)]: URLSearchParams {} }

And on a more theoretical note: you should not copy the internal structure of a class but only use its accessors ;-)

Kind regards,
Hans

@seriousme
Copy link
Contributor Author

@seriousme we need to do a rollback... so you'll have to rebase once more :) also there's a conflict in ws.js... one moment.

Bit of a bummer :-(
But if you can leave a comment once you feel its safe to rebase I will do so. Hopefully for the last time ;-)

Kind regards,
Hans

@qm3ster
Copy link
Contributor

qm3ster commented Oct 1, 2020

Maybe it would be prudent to look at how the parsed url is actually utilized.
Perhaps we don't need all these properties, especially since we're putting them all on the root options object.

@seriousme
Copy link
Contributor Author

Maybe it would be prudent to look at how the parsed url is actually utilized.
Perhaps we don't need all these properties, especially since we're putting them all on the root options object.

That would be a good idea, however the opts object is scattered around the code base. If you really want to go that route then I'd advise a more serious refactoring operation.

Kind regards,
Hans

opts.port = Number(parsed.port) || null
opts.username = parsed.username
opts.password = parsed.password
opts.searchParams = parsed.searchParams
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure this is all the info we could possibly need from the URL? I'm worried something might be missed without just the extend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure we don't need more. And the tests agreed with me on that ;-)

This is what url.parse returns:

> url.parse("https://user:password@www.example.com:8080/?a=b")
Url {
  protocol: 'https:',
  slashes: true,
  auth: 'user:password',
  host: 'www.example.com:8080',
  port: '8080',
  hostname: 'www.example.com',
  hash: null,
  search: '?a=b',
  query: 'a=b',
  pathname: '/',
  path: '/?a=b',
  href: 'https://user:password@www.example.com:8080/?a=b' 

Kind regards,
Hans

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: @nosovk already approved on September 2 so its not just me ;-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should actually be opts.host = parsed.hostname
This project uses host as the name for hostname (just domain, without port)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for the 11 locations where opts.hostname is being used ;-)
https://github.com/mqttjs/MQTT.js/search?q=opts.hostname

and currently (without my PR) the results from url.parse are copied into opts.

So currently:

The comments in type definitions seem to be a bit ahead of the code ;-)

But anyway:

For me its just a hobby, and I saw the issue and thought I'd help out.
If you all feel it can be done way better than I did, feel free!
(its only a few lines of code and from my PR you know where to look ;-))
That would save me the effort of rebasing (again) as well.

If you still want me to rebase because you are all way too busy, just let me know when its safe to do so I don't need to rebase a third time ;-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

appreciate your contributions just doing due diligence :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in that case: #1148 is also still open ;-)
It would convenient for me if that could be either merged or fixed in some other way better way :-)
(e.g. generate fresh certs if not available etc )

@qm3ster
Copy link
Contributor

qm3ster commented Oct 2, 2020

This is approximately what opts involves in different places, I haven't looked at ws-specific things yet:

// set/used in connect/index.js
interface Opts {
  protocol: string; //protocol without ":"
  hostname: string; //actually `host` (doesn't include port)
  port: string | number; //sometimes parsed as Number
  path: string; //url path, like /a/b
  // currently missing: search: string, // ?a=a&b=b
  query?: { clientId?: string }; // comes from url
  cert: unknown;
  key: unknown;
  clean?: boolean; // default = true
  servers: { host: string; port: string | number; protocol?: Protocol }[];
}

// in client.js
var defaultConnectOptions = {
  keepalive: 60,
  reschedulePings: true,
  protocolId: "MQTT",
  protocolVersion: 4,
  reconnectPeriod: 1000,
  connectTimeout: 30 * 1000,
  clean: true,
  resubscribe: true,
};
// debug printed
interface Options {
  protocol: string;
  protocolVersion: 4 | 5;
  username: string;
  keepalive: number;
  reconnectPeriod: number; // default 30s
  rejectUnauthorized: unknown;
}
// other
interface Options {
  clientId: string;
  customHandleAcks: Function;
  outgoingStore: Store;
  incomingStore: Store;
  queueQoSZero: boolean; // default = true
}
// this.options.customHandleAcks =
//     (options.protocolVersion === 5 && options.customHandleAcks)
//     ? options.customHandleAcks
//     : function () { arguments[3](0) }

// passed to mqtt-packet/parser.js
interface ParserOpts {
  protocolVersion: 4 | 5;
}
type Properties = unknown[] | Record<any,unknown>;
// passed to mqtt-packet/writeToStream.js#connect
// this packet has options set to it as prototype, so some options will be available that way
interface ConnectPacket {
  protocolId: "MQTT" | unknown,
  protocolVersion?: 3 | 4 | 5, //default 4
  will: {
    topic: string,
    payload: string | Buffer,
    retain?: boolean,
    qos?: QoS,
    properties: Properties, //only if mqtt5
  },
  clean: boolean,
  keepalive?: number, //default 0
  clientId?: string, //default ''
  username: string,
  password: string,
  properties: Properties, //only if mqtt5
}
interface PacketOpts {
  protocolVersion: 3 | 4 | 5; //default 4
}

@YoDaMa
Copy link
Contributor

YoDaMa commented Oct 5, 2020

ok @seriousme can you rebase and figure out what the conflict is with ws.js?

@seriousme seriousme force-pushed the replace-url-parse-by-url-class branch from ac98686 to 6aef40f Compare October 6, 2020 09:27
@seriousme
Copy link
Contributor Author

Ok, I have rebased and squashed into a single commit. The 10.x fail looks unrelated to this PR and the 12.x and 14.x succeed.

Kind regards,
Hans

@YoDaMa YoDaMa self-requested a review October 6, 2020 16:39
Copy link
Contributor

@YoDaMa YoDaMa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Connection problem with IPv6 addresses
4 participants