- git clone this repo to your local environment
git clone https://github.com/lbjay/ds4l-talk
cd ds4l-talk
- run `pip install -r requirements.txt`
- you may need to use `sudo` for this command to work, e.g. `sudo pip install -r requirements.txt`
- if `pymongo` fails because of a missing compiler, try `pip install --install-option="--no_ext" pymongo`
- create a free MongoDB instance at https://bridge.mongohq.com/signup
- create account by filling out the initial signup form
- on the next form look for the tiny link that says "skip this step and use a free database".
- create a new "Sandbox" database
- name it whatever, "ds4l", "DS4L", etc.
- go ahead and add a user/password combo (use whatever, "foo/bar" is fine)
- MongoDB
- basic operations and commands
- connect via MongoClient
from pymongo import MongoClient
mongo = MongoClient('mongodb://<user>:<pass>@dharma.mongohq.com:10045/ds4l')
- examine the db and collection commands
db = mongo.ds4l
coll = db.foo
- inserts, queries, cursors, updates
coll.insert({'a': 1, 'b': 2, 'c': 3})
coll.find_one({'b': 2})
c = coll.find({})
coll.update({'a': 1}, {'$set': {'c': 4}})
coll.insert({'a': 1, 'd': {'foo': 'bar', 'baz': [1,2,3]}})
- bibliographic data in MongoDB
- import fdocs.json
mongoimport -h dharma.mongohq.com:10045 --db ds4l --collection fdocs -u <user> -p <pass> < fdocs.json
- mongohq collection copy: import from mongodb://foo:bar@dharma.mongohq.com:10045/ds4l
- example queries
db.fdocs.find({'refereed': True}).count()
db.fdocs.find({'references': '2003ApJ...584L..13F'}).count()
db.fdocs.find({'references': {'$all': ['2003ApJ...584L..13F', '2003ApJ...589L..41S']}}).count()
db.fdocs.find({'citation_count': {'$gt': 10}}).count()
db.fdocs.find({'$where': 'this.references.length > 30'}).count()
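Without a live database handy, the semantics of these operators can be sketched in plain Python over a couple of made-up records (the field names come from the queries above; the values and the `bibcode` field are invented for illustration):

```python
# Hypothetical documents shaped like the fdocs records the queries assume
docs = [
    {'bibcode': 'A', 'refereed': True,  'citation_count': 25,
     'references': ['2003ApJ...584L..13F', '2003ApJ...589L..41S']},
    {'bibcode': 'B', 'refereed': False, 'citation_count': 3,
     'references': ['2003ApJ...584L..13F']},
]

# {'refereed': True} -- plain equality match
refereed = [d for d in docs if d['refereed'] is True]

# {'references': '2003ApJ...584L..13F'} -- a scalar matched against an
# array field tests membership, not equality
cites_f = [d for d in docs if '2003ApJ...584L..13F' in d['references']]

# {'references': {'$all': [...]}} -- the array must contain every value
wanted = {'2003ApJ...584L..13F', '2003ApJ...589L..41S'}
cites_both = [d for d in docs if wanted <= set(d['references'])]

# {'citation_count': {'$gt': 10}} -- numeric comparison
highly_cited = [d for d in docs if d['citation_count'] > 10]
```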
- aggregation and simple mapreduce example
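The repo's aggregation example isn't reproduced here, but the kind of result a `$group`/`$sum` pipeline (or an equivalent mapReduce) produces can be sketched in plain Python; the records below are hypothetical, and the real pipeline would run server-side via `coll.aggregate`:

```python
from collections import Counter

# Hypothetical records; the real data would come from the fdocs collection
docs = [
    {'refereed': True,  'citation_count': 25},
    {'refereed': True,  'citation_count': 3},
    {'refereed': False, 'citation_count': 7},
]

# Equivalent in spirit to:
#   db.fdocs.aggregate([{'$group': {'_id': '$refereed', 'n': {'$sum': 1}}}])
counts = Counter(d['refereed'] for d in docs)
# -> Counter({True: 2, False: 1})
```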
- Logstash
- basic input/output
- stdin/stdout
java -jar logstash.jar agent -f logstash-simple.conf
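The contents of `logstash-simple.conf` aren't shown here; a minimal stdin-to-stdout config for a 1.x-era logstash would look something like this (a sketch, not the repo's actual file):

```conf
input {
  stdin { }
}
output {
  # older releases used `stdout { debug => true }`;
  # newer ones take a codec option instead
  stdout { }
}
```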
- tcp
java -jar logstash.jar agent -f logstash-tcp.conf
nc localhost 3333
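A `logstash-tcp.conf` along these lines would accept the `nc` input above (a sketch; the actual file may differ):

```conf
input {
  tcp {
    # matches the `nc localhost 3333` example above
    port => 3333
  }
}
output {
  stdout { }
}
```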
- logfile/elasticsearch
- edit logstash-es.conf and update file path
java -jar logstash.jar agent -f logstash-es.conf -- kibana --backend elasticsearch://localhost/
head apache.log.example > apache.log
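The file path you are asked to edit would sit in a config shaped roughly like this (a sketch; elasticsearch option names such as `host` vs `hosts` and `embedded` vary across logstash versions):

```conf
input {
  file {
    # the path you update in logstash-es.conf above
    path => "/path/to/apache.log"
    type => "apache"
  }
}
output {
  elasticsearch { }
}
```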
- stdin/stdout
- pubsub w/ redis
java -jar logstash.jar agent -f logstash-redis.conf
- push apache.log events to redis
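A typical pubsub setup pairs two agents: one ships events into redis, another consumes them. The consumer side of a `logstash-redis.conf` might look like this (a sketch; the `key` and `data_type` values are assumptions, not the repo's actual settings):

```conf
input {
  redis {
    host      => "localhost"
    data_type => "list"
    key       => "logstash"   # assumed key name
  }
}
output {
  stdout { }
}
```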
- filters
java -jar logstash.jar agent -f logstash-filter1.conf
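A filter config for apache logs typically leans on grok's stock patterns; `logstash-filter1.conf` might contain something like this (a sketch, not the repo's actual file):

```conf
filter {
  grok {
    # COMBINEDAPACHELOG is a stock pattern shipped with logstash;
    # older releases used `pattern =>`, newer ones `match =>`
    match => [ "message", "%{COMBINEDAPACHELOG}" ]
  }
}
```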
- output to MongoDB
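For the MongoDB output, the relevant output block might look roughly like this (a sketch: the mongodb output was a contrib plugin, and its option names, e.g. `host`/`database` vs a single `uri`, differ across versions; the database and collection names here are assumptions):

```conf
output {
  mongodb {
    database   => "ds4l"
    collection => "events"
  }
}
```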
- Usage data collection & analysis
- reading events from apache log
- creating custom events
- generating simple metrics
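The first and last bullets can be sketched in plain Python: parse apache combined-format lines and roll them up into a simple status-code metric. The sample lines below are invented, not taken from apache.log.example, and the regex is a simplification of the combined format:

```python
import re
from collections import Counter

# Invented combined-format lines standing in for apache.log.example
LOG_LINES = [
    '127.0.0.1 - - [10/Oct/2013:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2013:13:55:40 -0700] "GET /missing HTTP/1.1" 404 209',
]

# Simplified combined-log pattern: ip, timestamp, request, status, size
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+)'
)

# Simple metric: requests per HTTP status code
status_counts = Counter()
for line in LOG_LINES:
    m = LINE_RE.match(line)
    if m:
        status_counts[m.group('status')] += 1
```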