Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add `dataset` extractor for `Meeseeks.Result` #1

Closed
mischov opened this issue Mar 30, 2017 · 6 comments

Comments

@mischov
Copy link
Owner

commented Mar 30, 2017

Add function Meeseeks.Result.dataset (and Meeseeks.dataset) inspired by HTMLElement.dataset.

iex> import Meeseeks.CSS
...
iex> html = "<div class=\"user\" data-id=\"123\" data-first-name=\"Hans\">Hans</div>"
...
iex> result = Meeseeks.all(html, css(".user"))
...
iex> Meeseeks.dataset(result)
%{"id" => "123", "first-name" => "Hans"} 

The suggestion was made by @Eiji7 for dataset to support value casting (ie, converting the above value of "id" from "123" to 123), but I think that this is beyond the scope of the library.

Implementation:

If one were to stick closely to the name mangling of HTMLElement.dataset, instead of creating:

 %{"id" => "123", "first-name" => "Hans"}

we would create:

%{"id" => "123", "firstName" => "Hans"}

or, if we were to stick to Elixir naming convention:

%{"id" => "123", "first_name" => "Hans"} 

Just sticking with dash (-) would require the least string manipulation, and is my preferred choice, but I am willing to be convinced otherwise.

@Eiji7

This comment has been minimized.

Copy link

commented Mar 31, 2017

@mischov:

The suggestion was made by @Eiji7 for dataset to support value casting (ie, converting the above value of "id" from "123" to 123), but I think that this is beyond the scope of the library.

I'm working on scraping project and I see that auto casting would be really helpful in my case.
So if we are not going to implement our methods (for atoms like :integer) then how about just allow to add own methods like:

Meeseeks.dataset(result, cast: %{"id" => &String.to_integer/1})

or, if we were to stick to Elixir naming convention:

If so then how about convert it to Atom, so we can simple fetch them like:

Meeseeks.dataset(result).id
@mischov

This comment has been minimized.

Copy link
Owner Author

commented Mar 31, 2017

@Eiji7

I'm working on scraping project and I see that auto casting would be really helpful in my case.
So if we are not going to implement our methods (for atoms like :integer) then how about just allow to add own methods:

Is there any benefit to combining dataset extraction and casting other than saving a little iteration?

Meeseeks.dataset(result)
|> cast(%{"id" => &String.to_integer/1})

If so then how about convert it to Atom, so we can simple fetch them

No go on that one- Atoms are not garbage collected so converting arbitrary/outside data from Strings to Atoms is generally not done.

@Eiji7

This comment has been minimized.

Copy link

commented Mar 31, 2017

@mischov:

Is there any benefit to combining dataset extraction and casting other than saving a little iteration?

ok, now I have a better plan for it (I can simply use Ecto for casting), nvm

@mischov

This comment has been minimized.

Copy link
Owner Author

commented Mar 31, 2017

@Eiji7

I can simply use Ecto for casting

Yeah, that was one reason I was suggesting to keep it separate.

No preference on whether I mangle the dataset key names ("data-first-name" to "first-name" or "firstName")?

@Eiji7

This comment has been minimized.

Copy link

commented Mar 31, 2017

@mischov: dataset method should not return data- part.
I think that you have 2 choices to decide about it to keep it simple to understand:

  1. as a CSS/HTML developer/designer the simplest way is just: first-name
  2. however any JavaScript developer (probably most of developers) would prefer firstName

I think that you should choose a JavaScript developer way, because developers looking for similar feature already know about original implementation and for other developers it will remain a curiosity.

@mischov

This comment has been minimized.

Copy link
Owner Author

commented Mar 31, 2017

@Eiji7
I agree.

Because the dataset is inspired by HTMLElement.dataset, I will keep name-mangling consistent with what users of HTMLElement.dataset expect.

Thank you for your feedback, I will move on to adding the feature now.

mischov added a commit that referenced this issue Apr 3, 2017
Add dataset extractor (#1)
  - Add `dataset` extractor that mimics the HTMLElement.dataset API

@mischov mischov closed this Apr 3, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.