(Also) parsing structured data while you're at it #2
Comments
@westurner can you say a bit more about the motivations/applications here?
How useful is a trained synthetic language model (with 'transformers', in this case) without reading comprehension? I think people may be expecting more from this approach (from Google, OpenAI) to NLP than _______. Can these models learn, and do reasoning and inference (and synthesis of genuinely new ideas rather than rehashes), given lots of noisy information? If so, extracting reusable, shareable structured data for more energy-efficient narrow ML applications is a most useful task. More structured data from all of that noise would be great; and might it be more efficient to extract structured data from HTML that's already paged into RAM, rather than in a separate pass? Perhaps an ironic gesture of opportunism.
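To make the "already paged into RAM" idea concrete: here is a minimal, hypothetical sketch of pulling JSON-LD structured data out of an HTML string in the same pass a crawler would use for text extraction, using only the Python standard library. The function name `extract_jsonld` and the sample document are illustrative assumptions, not part of lazynlp or this replication.

```python
# Hypothetical sketch: extract <script type="application/ld+json"> blocks
# from HTML that is already in memory, alongside normal text extraction.
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collects parsed JSON-LD records from a single HTML document."""
    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.records = []

    def handle_starttag(self, tag, attrs):
        # Flag only script tags declared as JSON-LD.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            try:
                self.records.append(json.loads(data))
            except json.JSONDecodeError:
                pass  # malformed blocks are simply skipped

def extract_jsonld(html):
    """Return a list of JSON-LD objects found in the given HTML string."""
    parser = JSONLDExtractor()
    parser.feed(html)
    return parser.records

sample = '''<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Example"}
</script>
</head><body>Plain text the crawler would keep anyway.</body></html>'''

print(extract_jsonld(sample))
```

Since the HTML is already parsed for text cleaning, a hook like this adds little marginal cost compared with re-fetching and re-parsing pages in a separate structured-data pass.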
Deleted the duplicate reply. I think this goes beyond the scope of a replication, but it might be worth looking into after the main goals have been reached. Otherwise, a fork is also an option.
I had thought that this thread held a reference to state-of-the-art language comprehension metrics, but I was wrong. I don't remember where that is; though this is a fantabulous resource regarding the topic: "Better Language Models and Their Implications" https://news.ycombinator.com/item?id=19163522
... from chiphuyen/lazynlp#1