Update TODO list
mremond committed Jan 14, 2019
1 parent 3ef7c24 commit e3a26d9
Showing 2 changed files with 2 additions and 1 deletion.
2 changes: 2 additions & 0 deletions TODO.md
@@ -1,6 +1,8 @@
# TODO

- Improve profile crawler by marking cross-referenced profiles as "certified".
- [CRAWLER] Add ability to filter on expected content type (defaults to "text/html", as we are primarily building an
HTML tool).
- Resolve Twitter short URLs inside embedded tweets.
- Prerender YouTube links.
It should work by embedding content in a way that avoids tracking. We cannot just embed the YouTube video snippet.
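
The [CRAWLER] item above describes filtering on an expected content type, defaulting to "text/html". A minimal sketch of such a check follows, assuming the crawler has access to the response's `Content-Type` header value. The `acceptContentType` helper is hypothetical and does not appear in this commit; it only illustrates one way the filter could work using the standard `mime` package.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// acceptContentType reports whether a Content-Type header value matches the
// expected media type. An empty expected value falls back to "text/html".
// Hypothetical helper: it is not part of the crawler shown in this commit.
func acceptContentType(header, expected string) bool {
	if expected == "" {
		expected = "text/html"
	}
	// ParseMediaType strips parameters such as "charset=utf-8" and
	// returns the media type in lower case.
	mediaType, _, err := mime.ParseMediaType(header)
	if err != nil {
		// Missing or malformed header: treat as not matching.
		return false
	}
	return strings.EqualFold(mediaType, expected)
}

func main() {
	for _, h := range []string{"text/html; charset=utf-8", "application/pdf", ""} {
		fmt.Printf("%q accepted: %v\n", h, acceptContentType(h, ""))
	}
}
```

Rejecting responses with a missing or malformed header is a design choice made for this sketch; a crawler could just as reasonably decide to sniff the body in that case.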
1 change: 0 additions & 1 deletion pkg/semweb/crawler.go
@@ -60,7 +60,6 @@ func (c *Crawler) enqueue(url string) {

// processURL retrieves a given URL and passes it to the features extractor.
// TODO:
// - Skip URLs that were already checked.
// - Store URLs and their canonical URLs? Check how best to handle canonical URLs.
func (c *Crawler) processURL(url string) {
	defer c.wg.Done()
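
The TODO comment in this hunk mentions skipping URLs that were already checked. One possible approach is a mutex-guarded set consulted before a URL is processed, sketched below. The `visitedSet` type and its `markIfNew` method are hypothetical names introduced for illustration; the diff does not show how the crawler actually tracks URLs, and this sketch does not address canonical URLs.

```go
package main

import (
	"fmt"
	"sync"
)

// visitedSet records URLs that have already been seen.
// Hypothetical type: not taken from the crawler in this commit.
type visitedSet struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newVisitedSet() *visitedSet {
	return &visitedSet{seen: make(map[string]struct{})}
}

// markIfNew records url and reports true the first time it is seen,
// and false on every later call with the same url.
func (v *visitedSet) markIfNew(url string) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	if _, ok := v.seen[url]; ok {
		return false
	}
	v.seen[url] = struct{}{}
	return true
}

func main() {
	v := newVisitedSet()
	urls := []string{"https://example.com/a", "https://example.com/b", "https://example.com/a"}
	for _, u := range urls {
		if v.markIfNew(u) {
			fmt.Println("process", u)
		} else {
			fmt.Println("skip", u)
		}
	}
}
```

Checking and recording in a single locked step avoids the race where two goroutines both see a URL as new and process it twice.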