Add skelbiu.lt scraper #5

Open
zexa opened this issue Sep 17, 2021 · 2 comments

Comments

@zexa
Owner

zexa commented Sep 17, 2021

No description provided.

@zexa
Owner Author

zexa commented Oct 11, 2021

I've implemented part of the skelbiu-lt scraper in #8.

That PR contains a rough roadmap of the fields needed before skelbiu-lt can be called version 1.0.0, which I'll paste below:

  • title
  • id
  • views
  • updated_at
  • liked_amount
  • description
  • location
  • quality
  • price
  • price_change
  • images
    • url
    • data
  • comments
    • author
    • content
    • created_at
  • author
    • image
      • url
      • data
    • created_at
    • contact_phone
    • contact_email
    • is_identity_confirmed
    • listing_amount
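
To make the roadmap a bit more concrete, here is a minimal sketch of how the fields above might map onto Rust types. The field names follow the list; the struct names and every type choice (String for dates and prices, Vec<u8> for image data, Option for fields that may be missing) are assumptions for illustration, not what the crate actually uses.

```rust
// Hypothetical types mirroring the roadmap. Field names come from the
// list; all Rust types chosen here are assumptions.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Image {
    pub url: String,
    pub data: Vec<u8>, // raw image bytes
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct Comment {
    pub author: String,
    pub content: String,
    pub created_at: String,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct Author {
    pub image: Option<Image>,
    pub created_at: String,
    pub contact_phone: Option<String>,
    pub contact_email: Option<String>,
    pub is_identity_confirmed: bool,
    pub listing_amount: u32,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct Listing {
    pub title: String,
    pub id: String,
    pub views: u64,
    pub updated_at: String,
    pub liked_amount: u64,
    pub description: String,
    pub location: String,
    pub quality: String,
    pub price: String,
    pub price_change: Option<String>,
    pub images: Vec<Image>,
    pub comments: Vec<Comment>,
    pub author: Author,
}
```

Deriving Default keeps partially scraped listings easy to build with `..Default::default()` while the remaining fields are still unimplemented.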

zexa added a commit that referenced this issue Oct 11, 2021
zexa added a commit that referenced this issue Oct 11, 2021
zexa added a commit that referenced this issue Oct 11, 2021
zexa added a commit that referenced this issue Oct 11, 2021
zexa added a commit that referenced this issue Oct 11, 2021
zexa added a commit that referenced this issue Oct 11, 2021
zexa added a commit that referenced this issue Oct 12, 2021
@zexa
Owner Author

zexa commented Oct 13, 2021

One thing I kept thinking about while adding some of the fields is that you often won't want all the data the scraper can get, e.g. image.data, image.url, or even image as a whole.

I'll probably end up introducing a concept that allows you to specify which fields are desirable and which fields are not.

This would probably take the shape of a Vec<String> passed to the scrape() and scrape_listing() methods.

E.g. vec!["title", "description", "price"] if you want only the title, description and price.
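
A rough sketch of what field selection could look like. The helper `select_fields` and the use of a map of already-scraped string values are assumptions made purely for illustration; the real scrape() and scrape_listing() signatures are not settled here. Note that vec!["title", ...] on its own is a Vec<&str>, so the strings need converting if the parameter really is a Vec<String>.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: keeps only the requested fields from an
/// already-scraped set of (field name, value) pairs.
/// A desired field that was not scraped maps to `None`;
/// fields that were not requested are left out entirely.
pub fn select_fields(
    scraped: &BTreeMap<String, String>,
    desired: &[String],
) -> BTreeMap<String, Option<String>> {
    desired
        .iter()
        .map(|name| (name.clone(), scraped.get(name).cloned()))
        .collect()
}
```

Taking `&[String]` rather than `Vec<String>` lets callers pass a Vec by reference without giving up ownership.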

We will probably add a ScrapingOption<T> struct to communicate which fields were desired and found, which fields were desired but not found and which fields were not desired.

While serializing these fields we will exclude fields that were not desired and show null for those that were desired but not found.
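
One possible shape for this, as a sketch only: the idea calls it a struct, but an enum with three variants seems to fit the three states. The variant names and the hand-rolled `to_json` function are assumptions; a real implementation would more likely lean on a serialization crate such as serde than build strings by hand.

```rust
/// Hypothetical shape for the `ScrapingOption<T>` idea; the variant
/// names are assumptions, not taken from the project.
#[derive(Debug, Clone, PartialEq)]
pub enum ScrapingOption<T> {
    Found(T),   // desired and found
    NotFound,   // desired but not found
    NotDesired, // not requested
}

/// Renders (name, value) pairs as a JSON-like object by hand:
/// `NotDesired` fields are skipped, `NotFound` becomes `null`.
/// String escaping is deliberately omitted in this sketch.
pub fn to_json(fields: &[(&str, ScrapingOption<String>)]) -> String {
    let parts: Vec<String> = fields
        .iter()
        .filter_map(|(name, value)| match value {
            ScrapingOption::Found(v) => Some(format!("\"{}\":\"{}\"", name, v)),
            ScrapingOption::NotFound => Some(format!("\"{}\":null", name)),
            ScrapingOption::NotDesired => None,
        })
        .collect();
    format!("{{{}}}", parts.join(","))
}
```

Keeping "not found" and "not desired" as separate variants is what lets the serializer tell them apart, which a plain Option<T> could not do.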

For now, we will scrape all the fields that we can think of and add all the possible data to a listing.

zexa added a commit that referenced this issue Oct 16, 2021