How can we share Sequelize models between podverse-web and podverse-feedparser as separate apps? #1
I'm also considering whether there is a fourth app that needs to be here, podverse-api? In that case I imagine I would remove all references to models and database stuff in podverse-web and podverse-feedparser, then whenever a database interaction needs to happen, I would make a request to podverse-api. Decoupling all the db stuff from -web and -feedparser sounds like more work than I prefer right now, but if that is a good pattern then I am up for doing it.
In the spirit of open source and reusable tools, this is how I'd stab at architecting it.

### High Level

There is a podcast-db application. It is the authority for the Podcast and Episode models, their associated RSS links, and all that. It knows nothing about MediaRefs / clips / playlists etc. It exposes two APIs: one is a RESTful API and the other is the Sequelize models that it owns. Focus on the latter because it is more relevant for podverse-web.

podverse-web consumes the podcast-db application. This means exposing an npm package... could use a git:// URL for now. Essentially podcast-db exposes Sequelize models that know how to interact with a Postgres instance containing all the podcast information. podverse-web also contains information about MediaRefs, Clips, Playlists, etc. podverse-api could happen, but possibly later. Because of this architecture, a MediaRef wouldn't necessarily be directly linked to an Episode in the database... but that's fine for the sake of being more decentralized and decoupled.

### podcast-db (lower level)

It doesn't need to run on a port or a server like with express/feathers. Really it is a set of routines and some sort of job queue mechanism. These routines would probably be executed as a command line application or something similar. No need to get HTTP involved to invoke these routines (beyond fetching the RSS feeds themselves).

#### Routine: update rss feed

Given an RSS feed, update the podcast-db.

#### Routine: add rss feed

Given an RSS feed, add a podcast/episode to the podcast-db.

#### Routine: determine which rss feeds need to be updated

When executed (perhaps hourly) it should result in adding the set of routines that need to be executed. It could be a query like "give me all podcast RSS URLs which have a last updated date older than 48 hrs".

### Job queue mechanism

There are a lot of these. Some you can run yourself and some you can leverage as a cloud service. There's Amazon's (https://aws.amazon.com/sqs/), and Azure has one too.

Fundamentally it is the orchestrator of a task, ensuring that it is queued up and that it is completed in a robust way. It would take orders to execute routines and make sure that they get executed.

### But maybe we don't need to build podcast-db so quickly

Consider using the audiosear.ch API... You can send a term such as the podcast "Invisibilia" to its API and it will return a JSON payload that has everything we need to get podcast/episode information based on searching for it. It is essentially the podcast-db piece, but via a RESTful API.
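The "determine which rss feeds need to be updated" routine above boils down to a staleness filter over the podcast table. Here is a minimal sketch in plain Node, written as a pure function so the logic is easy to test; the `lastUpdated` and `feedUrl` field names and the 48-hour threshold come from the description above, and the helper name is an assumption:

```javascript
// Return the RSS feed URLs of podcasts whose lastUpdated timestamp is older
// than maxAgeHours. In the real app this would likely be a Sequelize query
// with a less-than condition on lastUpdated rather than an in-memory filter.
function staleFeedUrls(podcasts, maxAgeHours, now = new Date()) {
  const cutoff = new Date(now.getTime() - maxAgeHours * 60 * 60 * 1000);
  return podcasts
    .filter(p => p.lastUpdated < cutoff)
    .map(p => p.feedUrl);
}

// Example: with a 48-hour threshold, only the first feed needs an update job.
const now = new Date('2016-06-10T00:00:00Z');
const podcasts = [
  { feedUrl: 'http://example.com/a.rss', lastUpdated: new Date('2016-06-01T00:00:00Z') },
  { feedUrl: 'http://example.com/b.rss', lastUpdated: new Date('2016-06-09T00:00:00Z') }
];
console.log(staleFeedUrls(podcasts, 48, now)); // → [ 'http://example.com/a.rss' ]
```

Each URL this returns would become one "update rss feed" job placed on the queue.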
I just played around with the audiosear.ch API, and one issue I have is that a few podcasts I listen to a lot (Rubin Report, #WeThePeople Live, Peace Propaganda) are not in the system. It looks like this doesn't have to be a show stopper though, as the website takes suggestions. When I query for a podcast by ID (887 for "Waking Up with Sam Harris"), I see 39 episode IDs, but there are 61 episodes in Sam Harris's RSS feed, so audiosear.ch apparently can return incomplete data.
Ohh, I see now. After rereading your proposal, audiosear.ch sounds very viable to me. Why run all these RSS parsing jobs ourselves when someone else is already doing it? audiosear.ch can take a huge load off our backs for the podcasts they support. I'd still like to have our own RSS feed parser though, which we'd use just to fill in the gaps for feeds that audiosear.ch doesn't provide yet (like Waking Up with Sam Harris, and Peace Propaganda). By drastically limiting the number of feeds we parse ourselves, keeping our feed parser robust should be much more manageable... (Whoops, didn't mean to press close.)
There may be some dealbreakers for sure. Having a podcast/episode in one request is not one of them; it's not a big deal in the grand scheme of things to have that as two requests at the moment. Not having control of which feeds show up is another. It could be that audiosear.ch has strict standards to ensure a clean database, and so they gatekeep RSS feeds. It is an illustration that this piece really is a whole other app, sort of unrelated to the core podverse MVP. I'd be happier with a micro-service running on Lambda or something that took an RSS feed, converted it into JSON, and shoved it back to the client to store in localStorage. No need to maintain a database of podcasts/episodes.
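The shape of that micro-service is simple enough to sketch. A real version would fetch the feed URL and run it through a proper parser (e.g. the feedparser package); the naive regex below is only to illustrate the request/response contract, and the function name is an assumption:

```javascript
// Illustrative RSS-to-JSON conversion: XML in, minimal JSON out for the
// client to keep in localStorage. Do not use a regex parser in production;
// this stands in for a real feed parser purely to show the output shape.
function rssToJson(rssXml) {
  const titles = [];
  const re = /<title>([^<]*)<\/title>/g;
  let m;
  while ((m = re.exec(rssXml)) !== null) titles.push(m[1]);
  return {
    podcast: titles[0] || null,   // channel-level <title> comes first in RSS
    episodes: titles.slice(1).map(title => ({ title }))
  };
}

const sampleFeed = `<rss><channel><title>Invisibilia</title>
  <item><title>Episode One</title></item>
  <item><title>Episode Two</title></item></channel></rss>`;
console.log(JSON.stringify(rssToJson(sampleFeed)));
// → {"podcast":"Invisibilia","episodes":[{"title":"Episode One"},{"title":"Episode Two"}]}
```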
I hear ya on how not storing podcasts/episodes to the db would simplify things greatly. The only thing is, I cannot imagine a UX where I would want to track down an RSS feed link every time I want to use the web clipper. More so, I can't imagine most of my friends/family ever doing that. I can, however, imagine myself and others clicking a Search icon, typing in the name of the podcast we are looking for, and then listening to an episode and making clips that way. Also, I want a web clipper with a good UX because I don't want users to be totally limited to iOS (if we were to go solely the mobile app route). So the reason I'm not going the localStorage route is 100% UX related. If I am misunderstanding and there is a way to accomplish this UX without a podcast-db, then I am interested in simplifying things. In the meantime I'll be working on getting podcast-db to work with podverse-web locally today.
Basically what I am planning on doing today is:
@scvnc podverse-web and podcast-db have been decoupled. podverse-web fires up for me when I run `npm start`, and all the features seem to be working. I can populate the db with feeds / podcasts / episodes with a CLI command.
There's some hardcoded db configuration in podverse-web and podcast-db that I assume will have to get cleaned up for deployment. I'm thinking I'll work on the shell scripts and queue stuff next. That stuff is newer territory for me. I'll look up tutorials on SQS and see what I can do...
With SQS and WebFaction we would need a cron job on WebFaction invoked every 5 minutes (or another considered interval). It would connect to SQS and retrieve one message (which is a task for feedparsing) and then do the feedparsing job (add or update a feed). After it's done (or errored!) it would have to interact with SQS again. If it succeeded, it has to tell SQS to delete the message from the queue because it was successfully parsed. If it errored, then we need to log that somewhere nice and then delete the message from SQS. The other task is "determine which rss feeds need to be updated", which should probably be a daily script that combs through each podcast's lastUpdated field and adds the appropriate update jobs to the SQS queue. They will later get picked up by the previously illustrated cron job.
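One cron tick of that worker can be sketched as follows. The queue client is injected rather than tied to the AWS SDK, so the control flow (receive one, run the job, log failures, always delete) is easy to test; with AWS, `queue.receive` and `queue.delete` would wrap SQS's `receiveMessage` (with `MaxNumberOfMessages: 1`) and `deleteMessage` calls. All names here are assumptions:

```javascript
// One cron invocation: pull a single message, run the feedparsing job, then
// delete the message whether it succeeded or failed. Failures are logged
// first so they can be inspected later, matching the plan described above.
async function processOneMessage(queue, runJob, log = console.error) {
  const message = await queue.receive();   // one queued task, or null if empty
  if (!message) return 'empty';
  try {
    await runJob(message.body);            // add or update the feed
  } catch (err) {
    log('feedparsing failed for', message.body, err); // log somewhere nice...
  }
  await queue.delete(message);             // ...then remove it from the queue
  return 'done';
}
```

A refinement worth considering later: on failure, skip the delete and rely on SQS's visibility timeout plus a dead-letter queue to retry, instead of deleting immediately.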
First, a rough idea of how I imagine podverse-feedparser working:
1. podverse-feedparser, podverse-web, and the podverse PostgreSQL database all listen on their separate ports, deployed on their separate servers.
2. Every few hours or so, a cron job triggers podverse-feedparser to query for all podcast RSS feed URLs in the database.
3. The parseFeeds method is called with the array of all the feed URLs currently in the db. parseFeeds adds each of these feeds to a queue to be parsed.
4. The parseFeeds queue runs sequentially, calling the parseFeed method with each URL until finished. As it goes, parseFeed writes updated podcast and episode data to the PostgreSQL db. (This parseFeed function already exists in podverse-web here.)
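The sequential queue in the steps above can be sketched as a simple async loop. `parseFeed` is assumed to be the existing function in podverse-web; the error-collecting behavior is an assumption, added so one broken feed can't abort the whole run:

```javascript
// Sequential feed-parsing queue: process one URL at a time so a single slow
// or broken feed can't flood the parser or the database connection.
async function parseFeeds(feedUrls, parseFeed) {
  const failed = [];
  for (const url of feedUrls) {
    try {
      await parseFeed(url);   // writes updated podcast/episode rows to Postgres
    } catch (err) {
      failed.push({ url, message: err.message });
    }
  }
  return failed;              // caller can log these or requeue them
}
```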
I feel confident I can write code to make each of these things happen, but I am not sure how to elegantly reuse the podverse-web `repositories/sequelize/engineFactory.js` and models in the separate podverse-feedparser app.
I considered using `npm install git://podverse-web` as a dependency in podverse-feedparser, then somehow loading the models within podverse-feedparser by loading podverse-web files available in `node_modules`... but I'm not quite sure how I'd do that yet, and I wonder if I'm heading down the wrong path.
Having two separate apps that share a PostgreSQL db is new territory for me. Any tips on how to architect this stuff?
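One common pattern for this situation is to have the shared package export a model *factory* rather than a live connection: each consuming app installs the package (a git URL works before publishing to npm) and passes in its own Sequelize instance, so both apps get identical model definitions against the same PostgreSQL database. The file layout, names, and attributes below are assumptions for illustration, not the actual repo structure:

```javascript
// Hypothetical index.js of a shared podcast-db package. It creates no
// database connection of its own; each consumer (podverse-web,
// podverse-feedparser) calls the factory with its own Sequelize instance.
function definePodcastModels(sequelize, DataTypes) {
  const Podcast = sequelize.define('Podcast', {
    title: DataTypes.STRING,
    feedUrl: DataTypes.STRING,
    lastUpdated: DataTypes.DATE
  });
  const Episode = sequelize.define('Episode', {
    title: DataTypes.STRING,
    mediaUrl: DataTypes.STRING
  });
  return { Podcast, Episode };
}

module.exports = definePodcastModels;

// In podverse-feedparser, after adding the package as a git dependency
// (URL assumed), usage would look roughly like:
//   const Sequelize = require('sequelize');
//   const sequelize = new Sequelize(process.env.DATABASE_URL);
//   const { Podcast } = require('podcast-db')(sequelize, Sequelize);
```

The factory shape also sidesteps the `engineFactory.js` coupling: connection setup stays in each app, and only the model definitions are shared.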