Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue #93 - uses re to parse links in tweet #102

Closed
wants to merge 1 commit into from

Conversation

pravj
Copy link

@pravj pravj commented Apr 14, 2015

Before this, the open {id} command was using only prefix-matching to parse links from tweets.
In old way, it used to split the tweet text and collected all the words with http/https as prefix.

As explained in a sample tweet in issue #93:

Super excited to have David DeSandro (@desandro) helping us revamp forecast.io (soon to be http://darksky.net).

The parsed links here, will be 'http://darksky.net).' instead of 'http://darksky.net', which is wrong.

Now it uses regex module to parse the same.

As shortened URLs in tweets have 22/23 string length(based on protocol). So, all the short URLs have unique-id of length 10 as suffix.

So, a simple re.findall with https?://t.co/[a-zA-Z0-9]{10} as search pattern, will return a list of all the links in the tweet text.

That's it. :octocat:

> Initially the `open {id}` command was using only prefix-matching to
> parse links from tweets.

> Now it uses regex module to parse the same.

> As shortened URLs in tweets have 22/23 string length(based on
> protocol). So, all the short URLs have unique-id of length 10 as
> suffix.
@orakaro
Copy link
Owner

orakaro commented Apr 16, 2015

Hi @pravj ! Thanks for suggest this. I will test for a while before merge 🍺

@Tenzer
Copy link
Contributor

Tenzer commented May 26, 2015

Wouldn't it be simpler and more reliable to just use the entities list Twitter includes with every tweet? https://dev.twitter.com/overview/api/entities-in-twitter-objects#urls

@orakaro
Copy link
Owner

orakaro commented Jul 31, 2015

Thanks @Tenzer .I will take a further look for the entities list

@Tenzer
Copy link
Contributor

Tenzer commented Jul 31, 2015

@dtvd You can take a look at the change to repall I made, that was switched to use entities: #90.

@orakaro
Copy link
Owner

orakaro commented Jul 31, 2015

@Tenzer thanks. Just update this function using entities

@orakaro orakaro closed this Jul 31, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants