Commit

Added more data for content extraction testing
peterc committed Jun 20, 2010
1 parent 0ac7193 commit da8099b
Showing 1 changed file with 20 additions and 4 deletions.
24 changes: 20 additions & 4 deletions test/corpus/metadata_expected.yaml
@@ -2,6 +2,7 @@
:rww:
:title: "Cartoon: Apple Tablet: Now With Barometer and Bird Call Generator"
:feed: http://www.readwriteweb.com/rss.xml
:lede: I'm just aching to know if the new Apple tablet (insert caveats, weasel words and qualifiers here) is a potential Cintiq competitor. I don't think it will be, but you never know.
:feeds:
- http://www.readwriteweb.com/rss.xml
- http://www.readwriteweb.com/archives/2010/01/cartoon_apple_tablet_now_with_barometer_and_bird_c.xml
@@ -33,7 +34,6 @@
:lede: "Separation of concerns between Factor VM and library codeThe Factor VM implements an abstract machine consisting of a data heap of objects, a code heap of machine code blocks, and a set of stacks. The VM loads an image file on startup, which becomes the data and code heap. "
:ledes:
- "Separation of concerns between Factor VM and library codeThe Factor VM implements an abstract machine consisting of a data heap of objects, a code heap of machine code blocks, and a set of stacks. The VM loads an image file on startup, which becomes the data and code heap. "
- Slava Pestov's weblog, primarily about Factor.
:youtube:
:title: YMO - Rydeen (Official Video)
:author: ymo1965
@@ -42,8 +42,7 @@
:spolsky:
:title: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) - Joel on Software
:description: Haven't mastered the basics of Unicode and character sets? Please don't write another line of code until you've read this article.
:ledes:
- Ever wonder about that mysterious Content-Type tag? You know, the one you're supposed to put in HTML and you never quite know what it should be?
:lede: I've been dismayed to discover just how many software developers aren't really completely up to speed on the mysterious world of character sets, encodings, Unicode, all that stuff. A couple of years ago, a beta tester for FogBUGZ was wondering whether it could handle incoming email in Japanese.
:author: Joel Spolsky
:favicon: /favicon.ico
:feed: http://www.joelonsoftware.com/rss.xml
@@ -54,4 +53,21 @@
:title: "CoffeeScript: A New Language With A Pure Ruby Compiler"
:author: Peter Cooper
:lede: CoffeeScript (GitHub repo) is a new programming language with a pure Ruby compiler. Creator Jeremy Ashkenas calls it "JavaScript's less ostentatious kid brother" - mostly because it compiles into JavaScript and shares most of the same constructs, but with a different, tighter syntax.
:feed: http://www.rubyinside.com/feed/
:feed: http://www.rubyinside.com/feed/
:zefrank:
:sentences: If there's anyone who knows how to marshal an online audience, it's Ze Frank. Ze is best-known for his 2006 program "The Show," in which he made a new 2-3 minute video every day for 1 year. Topics ranged from "fingers in food" to the mysteries of airport signage to a tour de force summary of creatives' addiction to un-executed ideas, aka brain crack.
:title: "Ze Frank on Imaginary Audiences :: Articles :: The 99 Percent"
:description: We chat with the Internet's most notorious mass-collaboration instigator Ze Frank about idea execution and how to build armies of sportsracers.
:tweet:
:lede: Gobsmacked that TeX/LaTeX (document formatting tools) for OS X is a 1.3GB (yes, GIGAbytes) download OS X. Wow..!
:sentences: Gobsmacked that TeX/LaTeX (document formatting tools) for OS X is a 1.3GB (yes, GIGAbytes) download OS Wow..!
:datetime: 2010-06-05 12:00:00 +01:00
:cant_read:
:sentences: "For those of us who grew up as weird kids in the 1980s, the work of Berkeley Breathed was as important as those twin eternal pillars of weird-kid-dom: Monty Python and Mad magazine. In a word: seminal. In two words: fucking seminal."
:gmane:
:sentences: I am pleased to report that the GCC Steering Committee and the FSF have approved the use of C++ in GCC itself. Of course, there's no reason for us to use C++ features just because we can. The goal is a better compiler for users, not a C++ code base for its own sake.
:queness:
:title: 18 Incredible CSS3 Effects You Have Never Seen Before
:lede: "CSS3 is hot these days and will soon be available in most modern browser. Just recently, I started to become aware to the present of CSS3 around the web. "
:sentences: CSS3 is hot these days and will soon be available in most modern browser. Just recently, I started to become aware to the present of CSS3 around the web. I can see some of the websites such as twitter and designer portfolios websites are using it.
:datetime: 2010-06-02 12:00:00 +01:00
