zweischlagzeilen Project -- ZwoSchlagzeilen Twitter-Bot

@ZwoSchlagzeilen is a Twitter bot that mixes recent headlines from leading German news sites and posts one mixed headline each hour. At best, it produces something like the "Fast Richtige Schlagzeilen" ("almost correct headlines") of the satirical magazine Titanic. It is essentially the German counterpart of the American original, @TwoHeadlines.

How does it work?

The script is quite simple. It uses feedparser to fetch headlines from the RSS feeds of several German newspaper websites, then randomly selects two headlines and mixes them. The mixing works as follows: first, part-of-speech (POS) tagging is applied to both headlines to identify their nouns. For accurate results, I use a tagger trained on German text, as explained in this blog post. The tagger is trained using Philipp Nolte's ClassifierBasedGermanTagger with the TIGER corpus from the University of Stuttgart. After POS tagging, a random number of nouns in one headline are replaced with nouns from the other headline, which yields the mixed headline. If the result is within the Twitter character limit, it becomes the final headline; otherwise the whole process starts over. Finally, the mixed headline is posted to the Twitter account using tweepy.
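The noun-swapping step described above can be sketched roughly as follows. This is an illustration, not the bot's actual code: the function names `mix_headlines` and `make_headline` are hypothetical, the headlines are assumed to be already tagged as lists of `(word, tag)` pairs (the shape NLTK taggers return), nouns are assumed to carry the STTS tags `NN` and `NE` used by the TIGER corpus, and the 140-character limit is an assumed default.

```python
import random

# Assumption: STTS tags for common nouns (NN) and proper nouns (NE).
NOUN_TAGS = {"NN", "NE"}


def mix_headlines(tagged_a, tagged_b, rng=random):
    """Replace a random number of nouns in tagged_a with nouns from tagged_b.

    Both arguments are lists of (word, tag) pairs. Returns the mixed
    headline as a string, or None if either headline has no nouns.
    """
    noun_positions = [i for i, (_, tag) in enumerate(tagged_a) if tag in NOUN_TAGS]
    donor_nouns = [word for word, tag in tagged_b if tag in NOUN_TAGS]
    if not noun_positions or not donor_nouns:
        return None

    # Swap at least one noun, at most as many as both headlines can supply.
    n_swaps = rng.randint(1, min(len(noun_positions), len(donor_nouns)))
    targets = rng.sample(noun_positions, n_swaps)
    replacements = rng.sample(donor_nouns, n_swaps)

    words = [word for word, _ in tagged_a]
    for pos, new_word in zip(targets, replacements):
        words[pos] = new_word
    # Naive detokenization: punctuation tokens will be preceded by a space.
    return " ".join(words)


def make_headline(tagged_headlines, max_len=140, max_tries=10, rng=random):
    """Retry mixing until a result fits max_len; give up after max_tries."""
    for _ in range(max_tries):
        a, b = rng.sample(tagged_headlines, 2)
        mixed = mix_headlines(a, b, rng)
        if mixed is not None and len(mixed) <= max_len:
            return mixed
    return None
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when checking why a particular mix was produced.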

Requirements

The script has been tested with Python 2.7.

The following Python packages need to be installed -- all are available via PyPI:

  • tweepy
  • feedparser
  • nltk

NLTK needs additional data for the tokenizer. Run nltk.download('punkt') once to fetch it.

"conf.py" is not in the repository. You will need to create a file "conf.py" with the following content:

TWITTER_CONSUMER_KEY = '...'
TWITTER_CONSUMER_SECRET = '...'
TWITTER_ACCESS_KEY = '...'
TWITTER_ACCESS_SECRET = '...'

# cut out the following substrings from headlines:
CUT_SUBSTRINGS = (
    u'*** BREAKING NEWS ***',
)

N_MAX_TRIES = 10
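
As an illustration of how a setting like CUT_SUBSTRINGS might be applied, here is a minimal sketch. The helper `clean_headline` is hypothetical and not taken from the repository; it simply removes each configured substring and tidies the leftover whitespace.

```python
# Same value as in the example conf.py above.
CUT_SUBSTRINGS = (
    u'*** BREAKING NEWS ***',
)


def clean_headline(headline, cut_substrings=CUT_SUBSTRINGS):
    """Hypothetical helper: strip unwanted substrings from a headline."""
    for substring in cut_substrings:
        headline = headline.replace(substring, u'')
    # Collapse the whitespace left behind by the removed substrings.
    return u' '.join(headline.split())


print(clean_headline(u'*** BREAKING NEWS *** Wetter bleibt sonnig'))
```

Note the trailing comma in the tuple: without it, a single-entry CUT_SUBSTRINGS would be a plain string, and the loop would iterate over its characters instead.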
