This repository has been archived by the owner on Sep 26, 2023. It is now read-only.

Commit

Merge 27d9703 into e29bced
albertpastrana committed Sep 15, 2015
2 parents e29bced + 27d9703 commit 09e7de2
Showing 2 changed files with 43 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
-#Gander [![Build Status](https://img.shields.io/travis/intenthq/gander.svg)](https://travis-ci.org/intenthq/gander) [![Coverage Status] (https://img.shields.io/coveralls/intenthq/gander.svg)](https://coveralls.io/github/intenthq/gander?branch=master) [![Maven Central](https://img.shields.io/maven-central/v/com.intenthq/gander_2.11.svg)](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22com.intenthq%22%20AND%20a%3A%22gander_2.11%22) [![Join the chat at https://gitter.im/intenthq/gander](https://img.shields.io/badge/gitter-join%20chat-green.svg)](https://gitter.im/intenthq/gander)
+#Gander [![Build Status](https://img.shields.io/travis/intenthq/gander/master.svg)](https://travis-ci.org/intenthq/gander) [![Coverage Status] (https://img.shields.io/coveralls/intenthq/gander.svg)](https://coveralls.io/github/intenthq/gander?branch=master) [![Maven Central](https://img.shields.io/maven-central/v/com.intenthq/gander_2.11.svg)](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22com.intenthq%22%20AND%20a%3A%22gander_2.11%22) [![Join the chat at https://gitter.im/intenthq/gander](https://img.shields.io/badge/gitter-join%20chat-green.svg)](https://gitter.im/intenthq/gander)

**Gander is a scala library that extracts metadata and content from web pages.**

42 changes: 42 additions & 0 deletions src/test/scala/com/intenthq/gander/ContentExtractorSpec.scala
@@ -17,6 +17,48 @@ class ContentExtractorSpec extends Specification {
     }
   }

+  "extractCanonicalLink" >> {
+    "should return none if no link found" >> {
+      val html =
+        """<html lang="ca">
+          | <head>
+          | </head>
+          |<body></body></html>""".stripMargin
+      extractCanonicalLink(Jsoup.parse(html)) must beNone
+    }
+
+    "should extract the canonical link from the meta tag" >> {
+      val html =
+        """<html lang="ca">
+          | <head>
+          | <link rel="canonical" href="http://example.com/canonical">
+          | <meta property="og:url" content="http://example.com/og" />
+          | <meta name="twitter:url" content="http://example.com/twitter" />
+          | </head>
+          |<body></body></html>""".stripMargin
+      extractCanonicalLink(Jsoup.parse(html)) must beSome("http://example.com/canonical")
+    }
+    "should extract the facebook og:url meta tag" >> {
+      val html =
+        """<html lang="ca">
+          | <head>
+          | <meta property="og:url" content="http://example.com/og" />
+          | <meta name="twitter:url" content="http://example.com/twitter" />
+          | </head>
+          |<body></body></html>""".stripMargin
+      extractCanonicalLink(Jsoup.parse(html)) must beSome("http://example.com/og")
+    }
+    "should extract the twitter:url meta tag" >> {
+      val html =
+        """<html lang="ca">
+          | <head>
+          | <meta name="twitter:url" content="http://example.com/twitter" />
+          | </head>
+          |<body></body></html>""".stripMargin
+      extractCanonicalLink(Jsoup.parse(html)) must beSome("http://example.com/twitter")
+    }
+  }
+
   "extractLang" >> {
     "should extract lang from html tag and give priority to it" >> {
       val html =

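Note on the `extractCanonicalLink` specs: taken together, the four cases appear to pin down a fallback order, with `<link rel="canonical">` first, then the `og:url` meta tag, then the `twitter:url` meta tag, and `None` when none is present. The implementation is not part of this diff, so the following is only a minimal sketch of that priority rule. `HeadUrls`, `CanonicalLinkSketch`, and `pick` are hypothetical names, not Gander APIs, and treating blank values as missing is an assumption the specs do not cover.

```scala
// Hypothetical model of the three candidate URLs found in <head>.
// This is NOT a Gander type; the real function takes a Jsoup Document.
final case class HeadUrls(
  canonicalLink: Option[String], // <link rel="canonical" href="...">
  ogUrl: Option[String],         // <meta property="og:url" content="...">
  twitterUrl: Option[String]     // <meta name="twitter:url" content="...">
)

object CanonicalLinkSketch {
  // Assumption: empty or whitespace-only values count as missing.
  private def nonBlank(value: Option[String]): Option[String] =
    value.map(_.trim).filter(_.nonEmpty)

  // Return the first candidate that is present, in the order the specs imply:
  // canonical link, then og:url, then twitter:url.
  def pick(urls: HeadUrls): Option[String] =
    nonBlank(urls.canonicalLink)
      .orElse(nonBlank(urls.ogUrl))
      .orElse(nonBlank(urls.twitterUrl))
}
```

Because `orElse` evaluates its argument lazily, later candidates are only inspected when the earlier ones are absent, which mirrors the way each spec removes one tag from the fixture and expects the next source to win.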