Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upAdd Extract unique Hashtag example. Fixes #242 #247
Conversation
budziq
requested changes
Jul 16, 2017
|
|
||
| The hashtag regex given here only catches latin hashtags that start with a letter. The [twitter hashtag regex] is way more complicated. | ||
|
|
||
| [twitter hashtag regex]: https://github.com/twitter/twitter-text/blob/master/java/src/com/twitter/Regex.java#L255 |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 16, 2017
Collaborator
please move the link source to reference section at bottom of the file
|
|
||
| Extracts a sorted and deduplicated list of hashtags from a text. | ||
|
|
||
| The hashtag regex given here only catches latin hashtags that start with a letter. The [twitter hashtag regex] is way more complicated. |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 16, 2017
Collaborator
I would add some adjective here
The [twitter hashtag regex] is way more complicated.
Like:
The complete [twitter hashtag regex] is way more complicated.
|
|
||
| ```rust | ||
| extern crate regex; | ||
| #[macro_use] extern crate lazy_static; |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 16, 2017
Collaborator
if using lazy_static please add its tag next to regex in both TOC's and example header
Also some description of what is going on here with lazy_static would be needed in the first (order of appearance) regex+lazy_static example
| r"\#[a-zA-Z][0-9a-zA-Z_]*" | ||
| ).unwrap(); | ||
| } | ||
| let mut results = vec![]; |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 16, 2017
Collaborator
why not just map and collect into HashSet or BTreeSet (if you need sorted results)?
Phaiax
force-pushed the
Phaiax:ex-hashtag-242
branch
from
711b579
to
c5793b1
Jul 22, 2017
budziq
requested changes
Jul 22, 2017
|
Nearly perfect! just few more touches. Sorry the review slipped past me for quite some time |
| @@ -535,3 +576,4 @@ fn main() { | |||
| <!-- Reference --> | |||
|
|
|||
| [race-condition-file]: https://en.wikipedia.org/wiki/Race_condition#File_systems | |||
| [twitter-hashtag-regex]: https://github.com/twitter/twitter-text/blob/master/java/src/com/twitter/Regex.java#L255 | |||
This comment has been minimized.
This comment has been minimized.
budziq
Jul 22, 2017
Collaborator
you can have the link text [twitter hashtag regex] without hyphens to avoid link remapping.
also linking to master branch will likely make the reference evaporate. I would suggest linking to specific git revision.
|
|
||
| Extracts a sorted and deduplicated list of hashtags from a text. | ||
|
|
||
| The hashtag regex given here only catches latin hashtags that start with a letter. The complete [twitter hashtag regex][twitter-hashtag-regex] is way more complicated. |
This comment has been minimized.
This comment has been minimized.
| fn main() { | ||
| let tweet = "Hey #world, I just got my new #dog, say hello to Till. #dog #forever #2 #_ "; | ||
| let tags = extract_hashtags(tweet); | ||
| assert!(tags.contains("#dog")); |
This comment has been minimized.
This comment has been minimized.
budziq
Jul 22, 2017
Collaborator
could you assert on the whole contents to show that its deduplicated?
| assert!(tags.contains("#dog")); | ||
| // Optional: Convert &str to String | ||
| let _tags : HashSet<String> = tags.iter().map(|&s| s.to_owned()).collect(); |
This comment has been minimized.
This comment has been minimized.
Phaiax
force-pushed the
Phaiax:ex-hashtag-242
branch
from
c5793b1
to
8b529d2
Jul 23, 2017
Phaiax
changed the title
Add Extract unique Hashtag example
Add Extract unique Hashtag example. Fixes #242
Jul 23, 2017
This comment has been minimized.
This comment has been minimized.
|
I rebased and merged manually. Thanks @Phaiax |
Phaiax commentedJul 15, 2017
#242