sitemap_to_df cannot handle multiple images #108

Closed
naidu-rohit opened this issue Oct 17, 2020 · 3 comments

Comments

@naidu-rohit

sitemap_to_df only returns the first image loc for each URL in the sitemap. If a URL has multiple images, the rest are ignored. Example: https://www.levi.in/sitemap_0.xml

How can we make this work for multiple images?

@eliasdabbas
Owner

True. This is exactly like #87, the issue of multiple video tags when available, as well as hreflang (not implemented yet).
I'm still considering different ways of doing this, because there will be multiple values for each row in the returned DataFrame.
For now, it seems like multiple values will be delimited by @@, as in the crawl function, and placed in the same cell.
Let me know if you have other suggestions.
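
To make the same-cell idea concrete, here is a minimal sketch that uses only xml.etree.ElementTree and pandas, not advertools internals. The sample sitemap, the function name sitemap_images_joined, and the column name image_loc are all made up for illustration; the real sitemap_to_df may end up doing this differently.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical two-URL sitemap; the first URL carries two images.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/a</loc>
    <image:image><image:loc>https://example.com/a1.jpg</image:loc></image:image>
    <image:image><image:loc>https://example.com/a2.jpg</image:loc></image:image>
  </url>
  <url>
    <loc>https://example.com/b</loc>
    <image:image><image:loc>https://example.com/b1.jpg</image:loc></image:image>
  </url>
</urlset>"""

# Namespace prefixes used in the search paths below.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}


def sitemap_images_joined(xml_text, sep="@@"):
    """Return one row per <url>, with all image locs joined by `sep`."""
    root = ET.fromstring(xml_text)
    rows = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        # Collect every image loc under this URL, not just the first one.
        images = [el.text for el in url.findall("image:image/image:loc", NS)]
        rows.append({"loc": loc, "image_loc": sep.join(images)})
    return pd.DataFrame(rows)


joined_df = sitemap_images_joined(SAMPLE)
# The individual values can be recovered later by splitting on the delimiter.
first_images = joined_df.loc[0, "image_loc"].split("@@")
```

This keeps one row per URL, at the cost of having to split the cell again whenever the individual images are needed.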

@naidu-rohit
Author

I was thinking we could keep images and videos in a separate DataFrame with loc as the key, so that we can have multiple images for each loc.
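
A rough sketch of that two-table layout, again with plain xml.etree.ElementTree and pandas rather than anything from advertools. The sample sitemap, the function name sitemap_to_two_frames, and the column names are assumptions chosen for illustration only.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical sitemap: URL "a" has two images, URL "b" has one.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/a</loc>
    <image:image><image:loc>https://example.com/a1.jpg</image:loc></image:image>
    <image:image><image:loc>https://example.com/a2.jpg</image:loc></image:image>
  </url>
  <url>
    <loc>https://example.com/b</loc>
    <image:image><image:loc>https://example.com/b1.jpg</image:loc></image:image>
  </url>
</urlset>"""

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}


def sitemap_to_two_frames(xml_text):
    """Return (urls_df, images_df); images_df has one row per image, keyed by loc."""
    root = ET.fromstring(xml_text)
    url_rows, image_rows = [], []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        url_rows.append({"loc": loc})
        for img in url.findall("image:image/image:loc", NS):
            image_rows.append({"loc": loc, "image_loc": img.text})
    urls_df = pd.DataFrame(url_rows, columns=["loc"])
    images_df = pd.DataFrame(image_rows, columns=["loc", "image_loc"])
    return urls_df, images_df


urls_df, images_df = sitemap_to_two_frames(SAMPLE)
# Joining on loc gives one row per (URL, image) pair when that view is needed.
merged = urls_df.merge(images_df, on="loc", how="left")
images_for_a = images_df.loc[images_df["loc"] == "https://example.com/a", "image_loc"].tolist()
```

The URL table stays at one row per loc, and the image table can grow to any number of rows per loc without delimiters in cells.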

@eliasdabbas
Owner

Similar to #87, to be handled for all types of multiple items in sitemaps.
