<img src="https://travis-ci.org/berkmancenter/tagteam.svg?branch=master" alt="Build Status" />

TagTeam

TagTeam is an RSS / Atom / RDF aggregator with the ability to filter and remix its input feeds with a high degree of flexibility.

Items can be added directly to TagTeam “bookmarking collections” via the provided Delicious-like bookmarklet, and these items can be remixed and filtered like any other item.

TagTeam can aggregate content from anything that emits RSS, Atom, or RDF. This includes Delicious, Zotero, WordPress, Twitter, MediaWiki, Connotea, Blogger, GitHub, and too many other applications and services to mention. It uses the feed-abstract gem, written as part of this project, to provide a better way of dealing with structured feeds. feed-abstract recognizes some feed generators and applies generator-specific handling, such as turning Twitter hashtags into actual tags on aggregated items.
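The hashtag-to-tag idea can be illustrated with a minimal sketch. The helper name +hashtags_to_tags+ and its normalization rules (lowercasing, de-duplication) are assumptions made for illustration; this is not the feed-abstract API and may differ from what the gem actually does.

```ruby
# Hypothetical helper, not part of feed-abstract.
# Pulls "#hashtag" tokens out of an item's text and returns them as
# lowercase, de-duplicated tag names.
def hashtags_to_tags(text)
  # The lookbehind skips a "#" that directly follows a word character
  # or "&" (for example an HTML entity such as "&#39;").
  text.scan(/(?<![\w&])#([A-Za-z_]\w*)/).flatten.map(&:downcase).uniq
end

tags = hashtags_to_tags("Reading about #OpenAccess and #rss today. #openaccess")
puts tags.inspect  # prints ["openaccess", "rss"]
```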

TagTeam can import Delicious and Connotea backups directly into a bookmark collection, and will support more formats soon.

Remixed feeds are available as RSS 2.0, Atom, and JSONP output and can be viewed directly in a hub. Feeds, FeedItems, and Tags can be added to and removed from a remixed feed contextually within the application.
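For readers unfamiliar with JSONP, the sketch below shows the general shape of such a response: a JSON payload wrapped in a call to a client-supplied callback. The helper name +to_jsonp+, the callback name, and the item fields are assumptions for illustration and do not describe TagTeam's actual output format.

```ruby
require "json"

# Hypothetical helper: wraps a list of feed items in a JSONP callback.
def to_jsonp(items, callback)
  # Only allow simple JavaScript identifiers as callback names so that
  # arbitrary script cannot be injected through the parameter.
  unless callback.match?(/\A[A-Za-z_$][\w$]*\z/)
    raise ArgumentError, "invalid callback name"
  end
  "#{callback}(#{JSON.generate(items)});"
end

items = [{ "title" => "Example item", "tags" => ["oa", "rss"] }]
puts to_jsonp(items, "handleFeed")
# prints handleFeed([{"title":"Example item","tags":["oa","rss"]}]);
```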

Tag filters can be applied at different levels to allow hub owners to maintain a consistent collection of tags - you can take messy tags in and give clean tags out.
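The "messy tags in, clean tags out" idea can be sketched as a small pipeline of filters applied in order. The +TagFilter+ struct, its three actions, and the +clean_tags+ helper are hypothetical names chosen for this example; they are not TagTeam's own filter classes, and the order in which the application applies filters from different levels is not shown here.

```ruby
# Hypothetical filter pipeline; names are illustrative only.
TagFilter = Struct.new(:action, :tag, :new_tag) do
  # Returns a new tag list with this filter applied.
  def apply(tags)
    case action
    when :add    then tags | [tag]
    when :delete then tags - [tag]
    when :modify then tags.map { |t| t == tag ? new_tag : t }.uniq
    else tags
    end
  end
end

# Applies each filter in turn to the result of the previous one.
def clean_tags(tags, filters)
  filters.reduce(tags) { |current, filter| filter.apply(current) }
end

hub_filters = [
  TagFilter.new(:modify, "open-access", "oa"),
  TagFilter.new(:delete, "misc"),
  TagFilter.new(:add, "reviewed")
]
puts clean_tags(%w[open-access misc rss oa], hub_filters).inspect
# prints ["oa", "rss", "reviewed"]
```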

TagTeam can be many things to many people, allowing you to ignore or utilize its many features depending on what you need. Examples:

  • It can aggregate RSS feeds and simply give you a Planet-like collection of items from many sources,

  • It can be your own Delicious-like bookmarking platform,

  • It can help you find items via its built-in search engine and through its support of “more like this” queries,

  • It can be a delivery platform to enable you to aggregate content and updates from all over your enterprise and all over the web as jsonp for use in web apps,

  • It can collect data about the input sources you watch (as it keeps detailed changelogs of all content it collects) for later analysis,

  • It can be a permanent archive of anything that emits formats TagTeam understands.

System Requirements

  • A *nix hosting environment,

  • PostgreSQL. MySQL should work with minimal changes, but it has not been tested,

  • redis - for sidekiq background job processing,

  • java - for sunspot fulltext searching. This has been tested under openjdk-6-jre on multiple flavors of ubuntu,

  • ruby 2.4.1

  • wget

Installation

TagTeam is a fairly traditional rails 5.0 app that needs a java daemon and redis for job tracking / management.

  • Install all system requirements (above).

    • Redis is really only being used as a job queue, so configuring it to write to disk isn’t all that necessary.

    • If you get an installation failure related to openssl on OS X: +rvm install 2.4.1 --with-openssl-dir=/usr/local/opt/openssl+.

    • If you’re using rvm, create a gemset to hold this application’s gem environment

  • +git clone https://github.com/berkmancenter/tagteam.git tagteam && cd tagteam && bundle+

  • Review config files:

    • config/environments/production.rb: configure email delivery if “sendmail” isn’t good enough.

    • config/initializers/devise.rb: Check values, especially pepper and mailer_sender.

  • Set up .yml configs:

    • +cp config/sunspot.yml.example config/sunspot.yml+

    • +cp config/tagteam.yml.example config/tagteam.yml+

    • +cp config/database.yml.example config/database.yml+ and set it up to connect to your postgres database.

  • +cp .env.development .env+ and alter it as needed for your environment.

  • Make sure the same Solr port is configured in both config/sunspot.yml and solr/conf/scripts.conf; otherwise the test suite will not run.

  • +export RAILS_ENV=production+

  • Set up databases and docs via +rake db:setup && rake db:migrate && rake db:seed+

  • +rake sunspot:solr:start+

  • Start sidekiq (+bundle exec sidekiq+) to run scheduled tasks

To run with +RAILS_ENV=production+ on localhost

  • Add +config.serve_static_assets = true+ to config/environments/production.rb

  • Comment out the lines with force_ssl in config/environments/production.rb

    • _Remove both of the above before git push._

  • Install yarn

  • +rake assets:precompile+

  • In config/sunspot.yml, change the production solr path to /solr/default.

  • +redis-server+

  • +rake sunspot:solr:start+

  • +bundle exec sidekiq+

  • +RAILS_ENV=production rails s+

Testing

Run the test suite with +rspec+.

Development

The only things you need are Docker (docs.docker.com/install/) and Docker Compose (docs.docker.com/compose/install/).

  • Install Docker and Docker Compose

  • Run +docker-compose up+ and wait until it sets up everything

  • The app will be available on localhost:3000

Caching

Most of TagTeam is action cached for non-authenticated users, and sometimes for users without administrative rights. The default cache time is 15 minutes, and the file-based cache is expired via cron jobs. Feel free to switch the caching backend, but if you’re not clustering and have a moderately fast disk subsystem there’s almost no point. +rake tmp:clear+ can be your cache-clearing sledgehammer, of course.
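To make the 15-minute expiry concrete, here is a minimal in-memory sketch of time-limited caching. The +TtlCache+ class, its injectable clock, and the cache key are hypothetical; TagTeam relies on Rails caching with a file-based store, not on this class.

```ruby
# Hypothetical in-memory stand-in for a time-limited cache.
class TtlCache
  def initialize(ttl_seconds, clock: -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock
    @store = {}
  end

  # Returns the cached value for key, recomputing it via the block when
  # the entry is missing or at least ttl_seconds old.
  def fetch(key)
    entry = @store[key]
    now = @clock.call
    if entry.nil? || now - entry[:at] >= @ttl
      entry = { value: yield, at: now }
      @store[key] = entry
    end
    entry[:value]
  end

  # The "sledgehammer": drop every entry at once.
  def clear
    @store.clear
  end
end

cache = TtlCache.new(15 * 60)
renders = 0
2.times { cache.fetch("hub/1") { renders += 1 } }
puts renders  # prints 1, because the second fetch is served from the cache
```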

Architectural Overview

  • A Hub is the highest level of organization. It has many HubFeeds, HubTagFilters, and RepublishedFeeds. It also has many Feeds and FeedItems through these other relationships.

  • A HubFeed links a Hub and a Feed together, giving a Hub owner the ability to override the title and description for a Feed but only in this Hub.

  • A HubFeed has many FeedItems through the Feed it relates to. A HubFeed also has many HubFeedTagFilters. A HubFeed can be a bookmarking feed, which means it doesn’t have an actual RSS/Atom feed that it’s aggregating, it only serves to group items added via the Bookmarklet or by a direct import.

  • A Feed is an actual RSS or Atom feed that we’re aggregating and is unique to this entire TagTeam instance. Feed spidering events are tracked in a FeedRetrieval model. A Feed can be an InputSource for a RepublishedFeed.

  • A FeedItem belongs to many Feeds and is tracked in the changelog YAML contained in a FeedRetrieval where appropriate. It can serve as an InputSource for a RepublishedFeed. It can also have HubFeedItemTagFilters.

  • A Tag is provided by the ActsAsTaggableOn::Tag class and is related to a FeedItem through an ActsAsTaggableOn::Tagging relation. It can be an InputSource in a RepublishedFeed. It also serves as the sources used in the Hub, FeedItem, and Feed TagFilter classes.

  • A RepublishedFeed is a “remix” - it aggregates together InputSources (Feeds, FeedItems, and Tags) as defined by a Hub owner.
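The remix relationship described above can be sketched with plain Ruby objects. These structs are stand-ins written for this example, not the application's ActiveRecord models; the field names, the +items+ method, and the rule that a Tag source selects every known item carrying that tag are assumptions.

```ruby
# Plain-Ruby stand-ins for the models described above (illustrative only).
Tag      = Struct.new(:name)
FeedItem = Struct.new(:title, :tags)
Feed     = Struct.new(:title, :feed_items)

# A remix collects items from its input sources: whole Feeds, single
# FeedItems, or Tags (here: every known item carrying that tag).
class RepublishedFeed
  def initialize(input_sources, all_items)
    @input_sources = input_sources
    @all_items = all_items
  end

  # Items contributed by all input sources, without duplicates.
  def items
    collected = @input_sources.flat_map do |source|
      case source
      when Feed     then source.feed_items
      when FeedItem then [source]
      when Tag      then @all_items.select { |item| item.tags.include?(source) }
      else []
      end
    end
    collected.uniq
  end
end

oa   = Tag.new("oa")
a    = FeedItem.new("A", [oa])
b    = FeedItem.new("B", [])
c    = FeedItem.new("C", [oa])
blog = Feed.new("Blog", [a, b])

remix = RepublishedFeed.new([blog, oa], [a, b, c])
puts remix.items.map(&:title).inspect  # prints ["A", "B", "C"]
```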

More info can be found in the rdoc-generated API docs (check out this repo and run +rake doc:app+) and in the doc/ddl.png graphical schema overview.

TODO

  • Add more InputSources for remixing, including searches, hubs, and remixed feeds,

  • Create a bookmarklet / chrome extension to streamline adding autodiscovered feeds from around the web,

  • Make TagTeam more social, exercising the excellent ACL9 framework for rights delegation,

  • Expose more of Sunspot’s capabilities in the frontend search interface,

  • Add more Tagteam::Importer subclasses to support more import file types,

  • Improve the UI to make the features more obvious,

  • Improve the API to allow objects to be managed via xml/json,

  • Better API docs.

Contributors

  • Dan Collis-Puro - everything technical

  • Peter Suber - the concept

License

TagTeam is licensed under the AGPL.

2016 President and Fellows of Harvard College

Performance Monitoring