We need to be able to create indexing jobs that define themselves, so that we can add new indexing jobs for new file formats. Let's also think about how to integrate data that is not file-based into our indexing scheme. A job should be able to define its download links and the right strategy for reading from that source.