Add Stack Overflow document source #57

Open
sylvinus opened this issue Aug 13, 2016 · 1 comment

Comments

@sylvinus
Contributor

Dumps seem to be available at https://archive.org/details/stackexchange

@wumpus

wumpus commented Oct 18, 2016

This is a good idea: you can do a much better job indexing Stack Exchange sites from the data dump. For example, the tags (like "Python", "Ruby", etc.) are only sometimes in the question title, but people searching Stack Overflow frequently put a language in their query. Yes, you can find the tags somewhere in the HTML, but it's probably easier to use the data dumps directly.
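To illustrate the point about tags, here is a rough sketch of how a dump-based source might read them. It assumes the dump ships a `Posts.xml` file whose `<row>` elements carry `Id`, `PostTypeId` (`1` for questions), `Title`, and a `Tags` attribute encoded like `<python><list>`; the layout should be checked against the actual dump, since it may differ between releases. The helper names `iter_questions` and `indexable_text` are made up for this example and are not part of this project.

```python
import io
import re
import xml.etree.ElementTree as ET

# Assumed tag encoding in the dump: "<python><list>"
TAG_RE = re.compile(r"<([^<>]+)>")


def iter_questions(source):
    """Yield (id, title, tags) for each question row in a Posts.xml-style stream."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        # PostTypeId "1" is assumed to mark questions; other rows are skipped.
        if elem.tag == "row" and elem.get("PostTypeId") == "1":
            tags = TAG_RE.findall(elem.get("Tags", ""))
            yield elem.get("Id"), elem.get("Title", ""), tags
        # Free the element so a multi-gigabyte dump does not pile up in memory.
        elem.clear()


def indexable_text(title, tags):
    """Append tags missing from the title so a query naming the language can match."""
    lowered = title.lower()
    missing = [t for t in tags if t.lower() not in lowered]
    return title if not missing else f"{title} [{' '.join(missing)}]"


# Tiny hand-written sample standing in for a real Posts.xml file.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" Title="How do I reverse a list?" Tags="&lt;python&gt;&lt;list&gt;" />
  <row Id="2" PostTypeId="2" Body="Use reversed()." />
</posts>"""

docs = [
    (post_id, indexable_text(title, tags))
    for post_id, title, tags in iter_questions(io.BytesIO(SAMPLE))
]
print(docs)
```

In the sample, the question title never mentions Python, so the tag is the only thing that would let a query like "python reverse list" match it. Streaming with `iterparse` is a design choice for large files; a real source would open the extracted `Posts.xml` instead of the in-memory sample.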
