Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Extract unique Hashtag example. Fixes #242 #247

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
3 participants
@Phaiax
Copy link
Contributor

Phaiax commented Jul 15, 2017

@budziq
Copy link
Collaborator

budziq left a comment

Hi @Phaiax
The example looks really nice 👍 just some suggestions below to help the next reviewer.

some of the comments here are applicable also to #248


The hashtag regex given here only catches latin hashtags that start with a letter. The [twitter hashtag regex] is way more complicated.

[twitter hashtag regex]: https://github.com/twitter/twitter-text/blob/master/java/src/com/twitter/Regex.java#L255

This comment has been minimized.

@budziq

budziq Jul 16, 2017

Collaborator

please move the link source to reference section at bottom of the file


Extracts a sorted and deduplicated list of hashtags from a text.

The hashtag regex given here only catches latin hashtags that start with a letter. The [twitter hashtag regex] is way more complicated.

This comment has been minimized.

@budziq

budziq Jul 16, 2017

Collaborator

I would add some adjective here

The [twitter hashtag regex] is way more complicated.

Like:
The complete [twitter hashtag regex] is way more complicated.


```rust
extern crate regex;
#[macro_use] extern crate lazy_static;

This comment has been minimized.

@budziq

budziq Jul 16, 2017

Collaborator

if using lazy_static please add its tag next to regex in both TOC's and example header

Also some description of what is going on here with lazy_static would be needed in the first (order of appearance) regex+lazy_static example

r"\#[a-zA-Z][0-9a-zA-Z_]*"
).unwrap();
}
let mut results = vec![];

This comment has been minimized.

@budziq

budziq Jul 16, 2017

Collaborator

why not just map and collect into HashSet or BTreeSet (if you need sorted results)?

@Phaiax Phaiax force-pushed the Phaiax:ex-hashtag-242 branch from 711b579 to c5793b1 Jul 22, 2017

@budziq
Copy link
Collaborator

budziq left a comment

Nearly perfect! just few more touches. Sorry the review slipped past me for quite some time

@@ -535,3 +576,4 @@ fn main() {
<!-- Reference -->

[race-condition-file]: https://en.wikipedia.org/wiki/Race_condition#File_systems
[twitter-hashtag-regex]: https://github.com/twitter/twitter-text/blob/master/java/src/com/twitter/Regex.java#L255

This comment has been minimized.

@budziq

budziq Jul 22, 2017

Collaborator

you can have the link text [twitter hashtag regex] without hyphens to avoid link remapping.
also linking to master branch will likely make the reference evaporate. I would suggest linking to specific git revision.


Extracts a sorted and deduplicated list of hashtags from a text.

The hashtag regex given here only catches latin hashtags that start with a letter. The complete [twitter hashtag regex][twitter-hashtag-regex] is way more complicated.

This comment has been minimized.

@budziq

budziq Jul 22, 2017

Collaborator

you can avoid relinking please see my comment below

fn main() {
let tweet = "Hey #world, I just got my new #dog, say hello to Till. #dog #forever #2 #_ ";
let tags = extract_hashtags(tweet);
assert!(tags.contains("#dog"));

This comment has been minimized.

@budziq

budziq Jul 22, 2017

Collaborator

could you assert on the whole contents to show that its deduplicated?

assert!(tags.contains("#dog"));
// Optional: Convert &str to String
let _tags : HashSet<String> = tags.iter().map(|&s| s.to_owned()).collect();

This comment has been minimized.

@budziq

budziq Jul 22, 2017

Collaborator

this is a little besides the point . I would suggest removing it.

@Phaiax Phaiax force-pushed the Phaiax:ex-hashtag-242 branch from c5793b1 to 8b529d2 Jul 23, 2017

@Phaiax Phaiax changed the title Add Extract unique Hashtag example Add Extract unique Hashtag example. Fixes #242 Jul 23, 2017

@brson

This comment has been minimized.

Copy link
Contributor

brson commented Jul 25, 2017

I rebased and merged manually. Thanks @Phaiax

@brson brson closed this Jul 25, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.