signout/arkiv24syv

arkiv24syv

When finished, this can create a mirror of the 24syv podcast archive.

The URL for the archive is https://arkiv.radio24syv.dk/
The URL for a specific podcast is https://arkiv.radio24syv.dk/audiopodcast/channel/4466232
All podcasts are available as RSS at https://arkiv.radio24syv.dk/rss. The feed gives 100 results per page; the rest are on sequentially numbered URLs like http://arkiv.radio24syv.dk/rss?p=2
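The page numbering can be sketched as a small helper. This is only an illustration: the function name `rss_page_url` is made up here, and it assumes that page 1 is the bare `/rss` URL and that `https` works for the numbered pages as well.

```python
from urllib.parse import urlencode

BASE_RSS_URL = "https://arkiv.radio24syv.dk/rss"


def rss_page_url(page: int) -> str:
    """Return the feed URL for a 1-based page number (hypothetical helper)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page == 1:
        # Assumption: the first 100 results live on the bare URL.
        return BASE_RSS_URL
    # Later pages use the ?p=N query parameter described above.
    return f"{BASE_RSS_URL}?{urlencode({'p': page})}"


if __name__ == "__main__":
    for n in range(1, 4):
        print(rss_page_url(n))
```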

pkg install python3
pkg install py36-feedparser
pkg install sqlite3
pkg install py36-sqlite3
pkg install py36-requests

pip3 install requests

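Index on guid (run this after the table exists; it needs a downloaded column, which the DB layout below does not list):

CREATE INDEX idx_guid ON items(guid,downloaded);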

DB layout

sqlite3 arkiv24syv.db
CREATE TABLE IF NOT EXISTS items (
  x INTEGER PRIMARY KEY ASC,
  entirepost,
  contenturl,
  filesize,
  title,
  link,
  description,
  guid,
  pubdate,
  duration
);
PRAGMA table_info(items);
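A minimal sketch of how this table could be used from Python's built-in `sqlite3` module, assuming that `guid` is what identifies an item. The helper names `open_db` and `insert_if_new` are hypothetical, and the real script may do this differently.

```python
import sqlite3

# Same columns as the layout above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS items (
  x INTEGER PRIMARY KEY ASC,
  entirepost, contenturl, filesize, title, link,
  description, guid, pubdate, duration
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the items table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def insert_if_new(conn, item):
    """Insert item (a dict keyed by column name) unless its guid is stored.

    Returns True when a row was inserted, False when it was skipped.
    """
    found = conn.execute(
        "SELECT 1 FROM items WHERE guid = ?", (item["guid"],)
    ).fetchone()
    if found is not None:
        return False
    conn.execute(
        "INSERT INTO items (entirepost, contenturl, filesize, title, link,"
        " description, guid, pubdate, duration)"
        " VALUES (:entirepost, :contenturl, :filesize, :title, :link,"
        " :description, :guid, :pubdate, :duration)",
        item,
    )
    conn.commit()
    return True
```

Calling `insert_if_new` twice with the same item inserts it once and skips it the second time, which matches the Inserted/Skipped wording used in the output further down.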

RSS entry

An entry in the feed looks like this:

<item>
  <enclosure url="https://arkiv.radio24syv.dk/49543331/54202478/dfc90ad8eaf93453562ed3961e565a5f/video_medium/nyheder-1000-28-07-2019-video.mp4?source=podcast" type="video/mp4" length="3550177"/>
  <title>Nyheder 10.00 28-07-2019</title>
  <link>https://arkiv.radio24syv.dk/photo/54202478/nyheder-1000-28-07-2019</link>
  <description>
    <p>
      <p>Nyheder fra Radio24syv</p>
      <ul>
        <li>Kriminalitetsnævn afgør flest sager om børn under 15 år</li>
        <li>Vrede guatemalanere kalder asylaftale med USA umoralsk</li>
        <li>Voldsom stigning i antallet af indbrud mod tandlægeklinikker</li>
        <li>Hvis for mange danskere anskaffer sig en elbil...</li>
        <li>Lars Bak parkerer cyklen efter sæsonen</li>
      </ul>
    </p>
    <p>
      <a href="https://arkiv.radio24syv.dk/photo/54202478/nyheder-1000-28-07-2019">
        <img src="https://arkiv.radio24syv.dk/49543331/54202478/dfc90ad8eaf93453562ed3961e565a5f/standard/download-thumbnail.jpg" width="1400" height="1400"/>
      </a>
    </p>
  </description>
  <guid>https://arkiv.radio24syv.dk/photo/54202478</guid>
  <pubDate>Sun, 28 Jul 2019 10:00:00 GMT</pubDate>
  <itunes:summary>
    Nyheder fra Radio24syv
Kriminalitetsnævn afgør flest sager om børn under 15 årVrede guatemalanere kalder asylaftale med USA umoralskVoldsom stigning i antallet af indbrud mod tandlægeklinikkerHvis for mange danskere anskaffer sig en elbil...Lars Bak parkerer cyklen efter sæsonen
  </itunes:summary>
  <itunes:subtitle>
    Nyheder fra Radio24syv
Kriminalitetsnævn afgør flest sager om børn under 15 årVrede guatemalanere kalder asylaftale med USA umoralskVoldsom stigning i antallet af indbrud mod tandlægeklinikkerHvis for mange danskere anskaffer sig en elbil...Lars...
  </itunes:subtitle>
  <itunes:author>Radio24syv</itunes:author>
  <itunes:duration>04:59</itunes:duration>
  <itunes:image href="https://arkiv.radio24syv.dk/49543331/54202478/dfc90ad8eaf93453562ed3961e565a5f/standard/download-thumbnail.jpg/thumbnail.jpg"/>
</item>
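As an illustration of how such an item maps onto the table columns, here is a sketch that uses the standard library's `xml.etree.ElementTree` instead of feedparser, so it runs without extra packages. The function name `item_to_row` is hypothetical, the sample is a trimmed copy of the entry above, and the iTunes namespace URI is assumed to be the usual one.

```python
import xml.etree.ElementTree as ET

# Assumption: the feed declares the standard iTunes podcast namespace.
ITUNES_NS = {"itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd"}

SAMPLE = """<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"><channel>
<item>
  <enclosure url="https://arkiv.radio24syv.dk/49543331/54202478/dfc90ad8eaf93453562ed3961e565a5f/video_medium/nyheder-1000-28-07-2019-video.mp4?source=podcast" type="video/mp4" length="3550177"/>
  <title>Nyheder 10.00 28-07-2019</title>
  <link>https://arkiv.radio24syv.dk/photo/54202478/nyheder-1000-28-07-2019</link>
  <guid>https://arkiv.radio24syv.dk/photo/54202478</guid>
  <pubDate>Sun, 28 Jul 2019 10:00:00 GMT</pubDate>
  <itunes:duration>04:59</itunes:duration>
</item>
</channel></rss>"""


def item_to_row(item):
    """Pick the fields that correspond to columns in the items table."""
    enclosure = item.find("enclosure")
    has_file = enclosure is not None
    return {
        "contenturl": enclosure.get("url") if has_file else None,
        "filesize": int(enclosure.get("length")) if has_file else None,
        "title": item.findtext("title"),
        "link": item.findtext("link"),
        "guid": item.findtext("guid"),
        "pubdate": item.findtext("pubDate"),
        "duration": item.findtext("itunes:duration", namespaces=ITUNES_NS),
    }


rows = [item_to_row(i) for i in ET.fromstring(SAMPLE).iter("item")]

if __name__ == "__main__":
    print(rows[0]["title"], rows[0]["duration"], rows[0]["filesize"])
```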

Occasionally they seem to insert items somewhere other than at the top of the list. Maybe I should load two or three pages on every run, or sift through all of them from time to time.

When that happens, the output looks like this:

Inserted 54754594
Inserted 54753152
Inserted 54751787
Inserted 54591692
Inserted 54754669
Inserted 54750313
Inserted 54748965
Inserted 54750298
Inserted 54747800
Inserted 54748950
Inserted 54746733
Inserted 54747794
Inserted 54745667
Skipped 54744568
Inserted 54746811
Skipped 54743553
Skipped 54742558
Skipped 54741514
Skipped 54740278

or this

Inserted 54933407
Inserted 54933584
Inserted 54932188
Inserted 54933574
Inserted 54930479
Inserted 54932213
Inserted 54928820
Inserted 54927425
Inserted 54928840
Inserted 54926003
Inserted 54924734
Inserted 54799396
Inserted 54927498
Inserted 54923337
Inserted 54922061
Inserted 54923314
Inserted 54920996
Inserted 54922062
Inserted 54919970
Inserted 54920981
Inserted 54918865
Inserted 54917706
Inserted 54920031
Inserted 54916517
Inserted 54915408
Inserted 54914257
Inserted 54915437
Inserted 54912578
Inserted 54911431
Inserted 54912574
Inserted 54927740
Skipped 54909210
Inserted 54927873
Inserted 54927722
Skipped 54905981
Skipped 54905963
Skipped 54908947
Skipped 54907872
Skipped 54904056
Inserted 54927714
Skipped 54904047
Skipped 54900500
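Since new items can turn up below ones that were already skipped, stopping at the first known item would miss them. One possible approach, sketched here under assumptions, is to always walk a fixed number of pages and check every entry. `scan_pages` and the `fetch_guids` callable are hypothetical names; `fetch_guids(page)` stands in for whatever downloads and parses one feed page.

```python
def scan_pages(fetch_guids, seen, max_pages=3):
    """Check every entry on the first max_pages pages.

    fetch_guids(page) is a hypothetical callable returning the guids on a
    1-based page, or an empty list past the end of the feed.
    seen is a set of already stored guids; it is updated in place.
    Returns the newly found guids in feed order.
    """
    new = []
    for page in range(1, max_pages + 1):
        guids = fetch_guids(page)
        if not guids:
            break  # no more pages
        for guid in guids:
            # Do not stop at the first known guid: later ones may be new.
            if guid not in seen:
                seen.add(guid)
                new.append(guid)
    return new


if __name__ == "__main__":
    # Made-up page contents, only to show the call shape.
    pages = {1: ["c", "a", "d"], 2: ["b", "e"]}
    already_stored = {"a", "b"}
    print(scan_pages(lambda p: pages.get(p, []), already_stored))
```

Passing a large `max_pages` gives the occasional full sweep, while a value of two or three keeps the regular runs cheap.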

