Skip to content

Podcast RESTful API with Custom Authentication & Advanced Scraping with custom JWT-based authentication and advanced Celery-managed scraping. It also Elasticsearch for robust event tracking and analysis.

Notifications You must be signed in to change notification settings

MrSinaxx/Final-Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

51 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Project Setup

Follow these steps to set up the project locally:

  1. Clone the repository:

    git clone https://github.com/your-username/your-repo.git
  2. Create a virtual environment (optional but recommended):

    python -m venv venv
  3. Activate the virtual environment:

    • On Windows:

      venv\Scripts\activate
    • On macOS and Linux:

      source venv/bin/activate
  4. Install project dependencies:

    pip install -r requirements.txt
  5. Run migrations to create the database schema:

    python manage.py migrate
  6. Start the Django development server:

    python manage.py runserver

The project should now be running locally at http://localhost:8000/.

Adding RSS URLs from Django Admin Panel

You can easily add RSS URLs to your podcast database using the Django admin panel. Follow these steps:

  1. Open your web browser and navigate to http://localhost:8000/admin/.

  2. Log in with your admin credentials (you may need to create a superuser account if you haven't already).

  3. In the admin panel, locate the "Podcasts" section or the relevant model for managing podcast entries.

  4. Click on "Add" or "Add Podcast" to create a new podcast entry.

  5. Fill in the required details, including the RSS URL, title.

  6. Save the entry.

You can repeat these steps to add more podcast entries as needed.

Running the Parsing Process

To parse the podcast feeds and populate your database with podcast episodes, you can use the following management command:

python manage.py parse_podcast

This command will trigger the parsing process, fetching the RSS feeds you added through the Django admin panel and updating your database with podcast episode data.

Django XML RSS Reader and Podcast Endpoint

This Django project serves as an XML RSS feed reader with a podcast and episode endpoint. It allows you to fetch, parse, and store podcast metadata and episodes from RSS feeds. Below are the project's URLs along with their descriptions.

URLs

1. Podcast List and Creation

  • URL: /podcasts/
  • Description: This endpoint allows you to list all podcasts and create new ones.

2. Podcast Detail

  • URL: /podcasts/<int:pk>/
  • Description: Retrieve details of a specific podcast using its ID.

3. Podcast Episode List and Creation

  • URL: /podcasts/<int:pk>/episodes/
  • Description: List all episodes of a specific podcast and create new episodes for that podcast.

4. Episode Detail

  • URL: /episodes/<int:pk>/
  • Description: Retrieve details of a specific podcast episode using its ID.

Usage

To use these endpoints, you can make GET and POST requests to the URLs listed above as per your requirements. For example:

  • To list all podcasts, make a GET request to /podcasts/.
  • To retrieve details of a specific podcast with ID 1, make a GET request to /podcasts/1/.
  • To list episodes of a specific podcast with ID 1, make a GET request to /podcasts/1/episodes/.
  • To retrieve details of a specific episode with ID 1, make a GET request to /episodes/1/.-

Advanced Analysis and Data Mapping Approaches

This project utilizes the xml.etree.ElementTree library to parse XML RSS feeds. The parse_rss_feed function fetches an RSS feed, extracts podcast metadata, and maps episode data to a structured format.

The advanced analysis and data mapping approaches used include:

  • Handling XML namespaces for iTunes elements (e.g., itunes:duration, itunes:explicit) by specifying the namespaces in ElementTree find queries.

  • Extracting podcast metadata such as title, summary, author, and more from the RSS feed's channel element.

  • Mapping episode data including title, duration, audio URL, publish date, explicit content, image URL, summary, and description.

Strategies for Handling Evolving Feed Structures

RSS feeds can evolve over time, adding or modifying elements. To handle evolving feed structures:

  1. Robust Parsing: Make parsing code robust to missing or changed elements by using conditional checks. If an element is not found or the structure changes, the code should gracefully handle it.

  2. Regular Testing: Regularly test the parser with a variety of RSS feeds to ensure it can adapt to different structures.

  3. Monitoring and Alerts: Implement monitoring and alerts for when the parser encounters unexpected changes in feed structures. This allows for manual intervention and code adjustments.

  4. Versioned Parsing: Consider creating parsers for different RSS feed versions if the changes are significant. This way, you can maintain compatibility with older feeds while adapting to new ones.

About

Podcast RESTful API with Custom Authentication & Advanced Scraping with custom JWT-based authentication and advanced Celery-managed scraping. It also Elasticsearch for robust event tracking and analysis.

Topics

Resources

Stars

Watchers

Forks

Packages

 
 
 

Languages