simple example
C Shell
Switch branches/tags
Nothing to show
Latest commit b8ef3ab May 2, 2012 @waitman waitman up 2012-05-01
Permalink
Failed to load latest commit information.
README
build-init.sh
build.sh
init
init.c
sources.txt
tmo
tmo.c

README

tmo.c example 

This simple example uses the MongoDB C Driver, pthreads and libcurl to fetch documents and insert into MongoDB. Parallel processing allows Documents to be retreived and stored simultaneously. 

URLs are loaded into a global array urls[]. A pointer to the index of each URL is passed to fetch and callback functions.

I have tried this sample code with up to 100 URLs, however the user should take care not to blow up their system with too many threads. 

There is limited error handling in this simple example. There also appears to be a condition where certain characters, or perhaps document encoding, cause a segfault on insert. Checking into this.

In a future example I will include code for parsing and manipulation of fetched data.

Waitman Gobble
San Jose, California USA

Update 2012-04-30

1). remove free w/o malloc
2). add status check block to avoid compiler warnings
3). add build.sh to avoid compiler warnings about comments
4). add script to insert test data
5). remove abuff / memcopy and use buffer from curl callback

Update 2012-05-01

6). added curl error handling code
7). added response header store
8). added lastModifiedTime store
9). added unique run id store
10). initial libxml2 doc parsing code


$ sh build-init.sh 
$ ./init
$ mongo argang
MongoDB shell version: 2.0.4
connecting to: argang
> show collections;
sources
system.indexes
> db.sources.find();
{ "_id" : ObjectId("4f9f47834b12106200000000"), "url" : "http://www.guardian.co.uk/rss" }
{ "_id" : ObjectId("4f9f47834b12106200000001"), "url" : "http://rss.cnn.com/rss/cnn_topstories.rss" }
{ "_id" : ObjectId("4f9f47834b12106200000002"), "url" : "http://www.alertnet.org/thenews/rss/index.xml" }
{ "_id" : ObjectId("4f9f47834b12106200000003"), "url" : "http://rss.mnginteractive.com/live/sanjose/SanJose_1923271.xml" }
{ "_id" : ObjectId("4f9f47834b12106200000004"), "url" : "http://rssfeeds.usatoday.com/usatoday-NewsTopStories" }
{ "_id" : ObjectId("4f9f47834b12106200000005"), "url" : "http://feeds.sfgate.com/sfgate/rss/feeds/news" }
{ "_id" : ObjectId("4f9f47834b12106200000006"), "url" : "http://my.abcnews.go.com/rsspublic/fp_rss20.xml" }
{ "_id" : ObjectId("4f9f47834b12106200000007"), "url" : "http://feeds.cbsnews.com/CBSNewsMain" }
{ "_id" : ObjectId("4f9f47834b12106200000008"), "url" : "http://www.nytimes.com/services/xml/rss/nyt/HomePage.xml" }
>