Releases: KalimeroMK/RssFeed

Code update

30 May 17:24

• Switched from SimpleXMLElement to SimplePie for more robust RSS feed parsing.
• Simplified feed parsing logic with SimplePie.
• Simplified image extraction using regex directly from feed item descriptions.
• Removed dependency on DOMDocument and DOMXPath for HTML parsing.
• Maintained robust image saving logic to download and save images to storage.
• Removed fetchContentUsingCurl and convertToAbsoluteUrl methods, reducing code complexity.
• Simplified error handling and logging for RSS feed parsing.
• Added dependency on Laravel’s Container for creating SimplePie instances, ensuring better integration with Laravel’s service container.
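
The regex-based image extraction mentioned above could look roughly like the following. This is a minimal sketch, not the package's actual code: the function name firstImageFromDescription and the pattern itself are assumptions made for illustration.

```php
<?php

// Hypothetical helper (not part of the package): returns the first <img> src
// found in a feed item's description HTML, or null when there is none.
function firstImageFromDescription(string $description): ?string
{
    // Match src="..." or src='...' inside an <img> tag, case-insensitively.
    // Requiring whitespace before "src" avoids matching data-src and similar.
    $pattern = '/<img\b[^>]*?\ssrc\s*=\s*(["\'])(.*?)\1/i';

    if (preg_match($pattern, $description, $matches) === 1) {
        // Decode entities such as &amp; that often appear in feed URLs.
        return html_entity_decode($matches[2]);
    }

    return null;
}
```

A regex like this is simpler than a DOM parser but only handles quoted src attributes; unquoted attributes would need a looser pattern.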

Image return update

07 Mar 14:23
  1. Find Images: The code looks for <img> tags within each found element.
     • $images = $xpath->query('.//img', $element): This line finds all <img> elements within the current element and stores them in $images.
  2. Iterate over Found Images: The loop iterates over each image found in the current element.
     • foreach ($images as $img): This starts a nested loop over the images found within the current element.
  3. Extract and Validate Image URLs: The code extracts the URL from each image and checks whether it is a real URL or a placeholder. If it is a placeholder, the code looks for the real image URL in alternative attributes commonly used for lazy-loaded images.
     • $src = $img->getAttribute('src'): This line gets the src attribute of the image.
     • If the src is a data URI or SVG (indicating a placeholder), the code checks other attributes like data-src, data-lazy-src, or data-original for an actual image URL.
  4. Convert Relative URLs to Absolute: If a valid image URL is found and it is relative, the code converts it to an absolute URL based on the page's domain.
     • $src = $this->convertToAbsoluteUrl($src, ...): This calls a method that converts a relative URL to an absolute one.
  5. Store Unique URLs: Finally, if the image URL has not already been added to the $imageUrls array, the code adds it.
     • if (!in_array($src, $imageUrls)) { $imageUrls[] = $src; }: This checks whether the URL is already in the array and, if not, adds it.
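
Put together, the steps above can be sketched as a standalone function. This is an illustration under stated assumptions, not the package's code: extractImageUrls and toAbsoluteUrl are hypothetical names, toAbsoluteUrl is a simplified stand-in for convertToAbsoluteUrl (it resolves every relative path against the site root), and the sketch searches the whole HTML fragment rather than a pre-selected element. It needs PHP 8.0 or later for str_starts_with and str_ends_with.

```php
<?php

// Hypothetical, simplified stand-in for convertToAbsoluteUrl.
// Assumption: relative paths are resolved against the site root only.
function toAbsoluteUrl(string $src, string $baseUrl): string
{
    if (preg_match('#^https?://#i', $src) === 1) {
        return $src; // already absolute
    }
    $parts = parse_url($baseUrl);
    if (str_starts_with($src, '//')) {
        return $parts['scheme'] . ':' . $src; // protocol-relative URL
    }
    return $parts['scheme'] . '://' . $parts['host'] . '/' . ltrim($src, '/');
}

// Hypothetical function: collects unique, absolute image URLs from HTML.
function extractImageUrls(string $html, string $baseUrl): array
{
    if (trim($html) === '') {
        return [];
    }
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from imperfect real-world HTML
    $xpath = new DOMXPath($doc);

    $imageUrls = [];
    foreach ($xpath->query('//img') as $img) {
        $src = $img->getAttribute('src');
        // Treat empty values, data URIs and SVG files as placeholders.
        $isPlaceholder = $src === ''
            || str_starts_with($src, 'data:')
            || str_ends_with(strtolower($src), '.svg');
        if ($isPlaceholder) {
            $src = '';
            // Fall back to attributes commonly used by lazy loaders.
            foreach (['data-src', 'data-lazy-src', 'data-original'] as $attr) {
                if ($img->getAttribute($attr) !== '') {
                    $src = $img->getAttribute($attr);
                    break;
                }
            }
        }
        if ($src === '') {
            continue; // no usable URL on this image
        }
        $src = toAbsoluteUrl($src, $baseUrl);
        if (!in_array($src, $imageUrls, true)) {
            $imageUrls[] = $src; // keep each URL once
        }
    }
    return $imageUrls;
}
```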

Add missing facade

07 Mar 12:20

Added the missing import: use Illuminate\Support\Facades\Log;

Code refactor

07 Mar 12:12

Code refactor

Add new features

23 Feb 13:02
  • Dynamic XPath Configuration: The method now dynamically selects XPath queries based on the domain of the RSS feed. This allows for custom content scraping strategies for different websites.
  • Image Size Filtering: Added functionality to filter images by their width, ensuring that only images larger than a specified width (e.g., 600px) are considered. This helps in focusing on significant images only.
  • Unique Images: Updated the logic to ensure that only unique images are returned by the method, eliminating duplicates and reducing unnecessary data.
  • Domain-based Configuration: Shifted to a domain-based configuration approach for XPath queries, allowing for more granular and accurate content extraction tailored to each specific source or domain.
  • Configuration File Usage: The class now leverages a configuration file (rssfeed.php) for setting parameters like minimum image width and domain-specific XPath queries, offering a centralized place for configurations.
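
As a rough illustration of the domain-based configuration and width filtering described above, the sketch below uses a plain array in place of rssfeed.php so that it runs on its own. Every key name (min_image_width, default_xpath, domain_xpaths), both function names, and the sample domains are assumptions; the real configuration file may be laid out differently. Image widths are passed in as data here rather than measured.

```php
<?php

// Hypothetical shape of config/rssfeed.php. A real Laravel config file would
// `return` this array; it is assigned to a variable so the sketch is standalone.
$config = [
    'min_image_width' => 600,
    'default_xpath'   => '//article',
    'domain_xpaths'   => [
        'example.com'      => "//div[@class='post-content']",
        'news.example.org' => "//div[@class='article-body']",
    ],
];

// Pick the XPath query for a URL's domain, falling back to the default.
function xpathForUrl(string $url, array $config): string
{
    $host = parse_url($url, PHP_URL_HOST);
    // Normalise the host: lower-case and without a leading "www.".
    $host = is_string($host) ? preg_replace('/^www\./i', '', strtolower($host)) : '';

    return $config['domain_xpaths'][$host] ?? $config['default_xpath'];
}

// Keep unique URLs of images wider than $minWidth.
// $images is a list of ['url' => string, 'width' => int] entries.
function filterImagesByWidth(array $images, int $minWidth): array
{
    $kept = [];
    foreach ($images as $image) {
        if ($image['width'] > $minWidth && !in_array($image['url'], $kept, true)) {
            $kept[] = $image['url'];
        }
    }
    return $kept;
}
```

Keeping the per-domain queries in one array means a new source can be supported by adding a configuration entry instead of changing code.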

v2.1

03 Feb 12:53

Some small updates

v2

03 Feb 12:37
  1. Integration of cURL for Fetching Data
  2. Image Extraction from Content
  3. XPath Configuration for Content Selection
  4. Support for Multiple div Classes: Expanded the configuration to include an additional div class, demonstrating how to target content areas with various class attributes. This showcases the method's adaptability to different webpage structures.
  5. Error Handling and Robustness: Implemented error handling in various parts of the code to gracefully manage exceptions and unexpected situations, such as HTTP errors or issues with image URLs. This increases the code's robustness and reliability.
  6. Return Structured Data: Adjusted methods to return structured data, including both textual content and arrays of image URLs, providing a comprehensive overview of the processed content.
  7. Flexible Content Extraction: Demonstrated how to extract and concatenate selected HTML content, preserving tags while optionally removing scripts and styles, thus maintaining the relevance and cleanliness of the content.
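
The content-selection parts of this release (several div classes, script and style removal, structured return value) might be sketched as follows. This is an assumption-laden illustration rather than the released code: extractContent, its parameters, and the 'content' and 'images' keys are hypothetical, and the HTML is taken as an argument instead of being fetched with cURL.

```php
<?php

// Hypothetical function: selects content with several XPath queries and
// returns the concatenated HTML together with the image URLs found in it.
function extractContent(string $html, array $xpathQueries): array
{
    if (trim($html) === '') {
        return ['content' => '', 'images' => []];
    }
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from imperfect HTML
    $xpath = new DOMXPath($doc);

    // Remove scripts and styles. The node list is copied to an array first,
    // because removing nodes while iterating a live list skips entries.
    foreach (iterator_to_array($xpath->query('//script | //style')) as $node) {
        $node->parentNode->removeChild($node);
    }

    $content = '';
    $images = [];
    foreach ($xpathQueries as $query) {
        $nodes = $xpath->query($query);
        if ($nodes === false) {
            continue; // invalid query: skip it instead of failing
        }
        foreach ($nodes as $element) {
            // saveHTML($node) serialises the element with its tags preserved.
            $content .= $doc->saveHTML($element);
            foreach ($xpath->query('.//img', $element) as $img) {
                $src = $img->getAttribute('src');
                if ($src !== '' && !in_array($src, $images, true)) {
                    $images[] = $src;
                }
            }
        }
    }

    return ['content' => $content, 'images' => $images];
}
```

Passing the queries as a list is what lets one call cover several div classes, for example "//div[@class='post']" and "//div[@class='extra']".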

v1.2.1

03 Feb 12:18

Code refactor

v1.2

14 Jun 06:47
35ca80a

Made the RssFeed class implement ShouldQueue and use the Dispatchable trait so it can be dispatched as a queued job.

Added return type

11 Jun 23:06
78bf34b

Added return type array to the parseRssFeeds method.
Added return type string to the saveImageToStorage method.
Added return type bool|string to the retrieveFullContent method.
Added return type string|null to the getImageWithSizeGreaterThan method.
Added explicit type casts, such as (string), to the properties read within the foreach loops.
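
To show what these declarations and casts buy, here is a small standalone sketch. It does not reproduce the package's methods: parseItems and firstTitle are hypothetical names, and the feed structure is assumed to be plain RSS read with SimpleXML. The union return type bool|string requires PHP 8.0 or later.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: shows an `array` return type and (string) casts.
function parseItems(string $xml): array
{
    $feed = simplexml_load_string($xml);
    if ($feed === false || !isset($feed->channel->item)) {
        return [];
    }

    $items = [];
    foreach ($feed->channel->item as $item) {
        // Child nodes are SimpleXMLElement objects; casting yields plain strings.
        $items[] = [
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
        ];
    }

    return $items;
}

// Hypothetical helper: shows a union return type, false when nothing is found.
function firstTitle(string $xml): bool|string
{
    $items = parseItems($xml);

    return $items === [] ? false : $items[0]['title'];
}
```

Without the casts, the array would hold SimpleXMLElement objects, which compare and serialise differently from strings.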