tweepy

Streaming

Streams use the Streaming HTTP protocol to deliver data through an open, streaming API connection. Rather than delivering data in batches through repeated requests by your client app, as might be expected from a REST API, a single connection is opened between your app and the API, and new results are sent through that connection whenever new matches occur. This results in a low-latency delivery mechanism that can support very high throughput. For further information, see https://developer.twitter.com/en/docs/tutorials/consuming-streaming-data
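To make the single-connection model concrete, here is a minimal sketch of how a client might split a continuous byte stream into individual messages. It assumes newline-delimited JSON with blank lines used as keep-alive signals; the `iter_messages` helper and the sample payloads are hypothetical and are not part of Tweepy, which handles this internally.

```python
import json


def iter_messages(chunks):
    """Yield one decoded JSON object per complete line in a chunked byte stream."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A chunk may contain part of a line, or several complete lines.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            line = line.strip()
            if line:  # blank lines are treated as keep-alive signals (assumption)
                yield json.loads(line)


# Simulated network chunks; their boundaries do not line up with message boundaries.
chunks = [b'{"id": 1}\r\n{"i', b'd": 2}\r\n', b"\r\n", b'{"id": 3}\r\n']
ids = [message["id"] for message in iter_messages(chunks)]
print(ids)
```

The buffer is what lets the reader cope with messages that arrive split across reads, which is the main practical difference from handling one complete REST response at a time.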

Stream allows filtering and sampling of realtime Tweets using Twitter API v1.1.

StreamingClient allows filtering and sampling of realtime Tweets using Twitter API v2.

Using `Stream`

To use Stream, an instance of it needs to be initialized with Twitter API credentials (Consumer Key, Consumer Secret, Access Token, Access Token Secret):

import tweepy

stream = tweepy.Stream(
    "Consumer Key here", "Consumer Secret here",
    "Access Token here", "Access Token Secret here"
)

Then, Stream.filter or Stream.sample can be used to connect to and run a stream:

stream.filter(track=["Tweepy"])

Data received from the stream is passed to Stream.on_data. This method handles sending the data to other methods based on the message type. For example, if a Tweet is received from the stream, the raw data is sent to Stream.on_data, which constructs a Status object and passes it to Stream.on_status. By default, the methods other than Stream.on_data that receive data from the stream simply log it, at a logging level that depends on the type of the data.

To customize the processing of the stream data, Stream needs to be subclassed. For example, to print the IDs of every Tweet received:

class IDPrinter(tweepy.Stream):

    def on_status(self, status):
        print(status.id)


printer = IDPrinter(
    "Consumer Key here", "Consumer Secret here",
    "Access Token here", "Access Token Secret here"
)
printer.sample()

Using `StreamingClient`

To use StreamingClient, an instance of it needs to be initialized with a Twitter API Bearer Token:

import tweepy

streaming_client = tweepy.StreamingClient("Bearer Token here")

Then, StreamingClient.sample can be used to connect to and run a sampling stream:

streaming_client.sample()

Or StreamingClient.add_rules can be used to add rules before using StreamingClient.filter to connect to and run a filtered stream:

streaming_client.add_rules(tweepy.StreamRule("Tweepy"))
streaming_client.filter()

StreamingClient.get_rules can be used to retrieve existing rules and StreamingClient.delete_rules can be used to delete rules.

To learn how to build rules, refer to the Twitter API "Building rules for filtered stream" documentation.

Data received from the stream is passed to StreamingClient.on_data. This method handles sending the data to other methods. Tweets received are sent to StreamingClient.on_tweet, includes data are sent to StreamingClient.on_includes, errors are sent to StreamingClient.on_errors, and matching rules are sent to StreamingClient.on_matching_rules. A StreamResponse instance containing all four fields is sent to StreamingClient.on_response. By default, only StreamingClient.on_response logs the data received, at the DEBUG logging level.
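The routing described above can be sketched with the standard library alone. This is an illustration, not Tweepy's implementation: `ToyClient` and `Response` are hypothetical stand-ins, the sample payload is invented, and the sketch assumes that each handler is called only when its field is present and that the combined response is delivered last.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for the four-field response object.
Response = namedtuple("Response", ("data", "includes", "errors", "matching_rules"))


class ToyClient:
    """Illustrative dispatcher only; not part of Tweepy."""

    def __init__(self):
        self.calls = []          # names of the handlers invoked, in order
        self.last_response = None

    def on_data(self, raw_data):
        payload = json.loads(raw_data)
        tweet = payload.get("data")
        includes = payload.get("includes", {})
        errors = payload.get("errors", [])
        matching_rules = payload.get("matching_rules", [])
        # Assumption: a handler runs only when its field is in the payload.
        if "data" in payload:
            self.on_tweet(tweet)
        if "includes" in payload:
            self.on_includes(includes)
        if "errors" in payload:
            self.on_errors(errors)
        if "matching_rules" in payload:
            self.on_matching_rules(matching_rules)
        # The combined response carries all four fields.
        self.on_response(Response(tweet, includes, errors, matching_rules))

    def on_tweet(self, tweet):
        self.calls.append("on_tweet")

    def on_includes(self, includes):
        self.calls.append("on_includes")

    def on_errors(self, errors):
        self.calls.append("on_errors")

    def on_matching_rules(self, matching_rules):
        self.calls.append("on_matching_rules")

    def on_response(self, response):
        self.calls.append("on_response")
        self.last_response = response


client = ToyClient()
client.on_data('{"data": {"id": "1", "text": "hi"}, "matching_rules": [{"id": "9", "tag": "demo"}]}')
print(client.calls)
```

Seen this way, overriding a single handler such as `on_tweet` is enough when only one kind of data matters, while `on_response` is the place to work with all four fields together.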

To customize the processing of the stream data, StreamingClient needs to be subclassed. For example, to print the IDs of every Tweet received:

class IDPrinter(tweepy.StreamingClient):

    def on_tweet(self, tweet):
        print(tweet.id)


printer = IDPrinter("Bearer Token here")
printer.sample()

Threading

Stream.filter, Stream.sample, StreamingClient.filter, and StreamingClient.sample all have a threaded parameter. When set to True, the stream will run in a separate thread, which is returned by the call to the method. For example:

thread = stream.filter(follow=[1072250532645998596], threaded=True)

or:

thread = streaming_client.sample(threaded=True)
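The pattern of returning the worker thread to the caller can be illustrated with a small stand-in. `ToyStream` below is hypothetical and only mimics the shape of the `threaded` parameter; it does not connect to anything and is not how Tweepy is implemented.

```python
import threading


class ToyStream:
    """Hypothetical stand-in for a stream class; not part of Tweepy."""

    def __init__(self, messages):
        self.messages = messages
        self.received = []

    def _run(self):
        # Stands in for the blocking loop that reads from the connection.
        for message in self.messages:
            self.received.append(message)

    def sample(self, threaded=False):
        if threaded:
            # Run the loop in the background and hand the thread to the caller.
            thread = threading.Thread(target=self._run, daemon=True)
            thread.start()
            return thread
        # Otherwise block the calling thread until the loop finishes.
        self._run()


toy = ToyStream(["a", "b", "c"])
thread = toy.sample(threaded=True)
thread.join()  # wait for the background loop to end
print(toy.received)
```

Keeping the returned thread lets the caller do other work while the stream runs and later wait for it with `join()`, typically after asking the stream to stop.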

Handling Errors

Both Stream and StreamingClient have multiple methods to handle errors during streaming.

Stream.on_closed / StreamingClient.on_closed is called when the stream is closed by Twitter.

Stream.on_connection_error / StreamingClient.on_connection_error is called when the stream encounters a connection error.

Stream.on_request_error / StreamingClient.on_request_error is called when an error is encountered while trying to connect to the stream.

When these errors are encountered and max_retries, which defaults to infinite, hasn't been exceeded yet, the Stream / StreamingClient instance will attempt to reconnect the stream after an appropriate amount of time. By default, both versions of all three of these methods log an error. To customize that handling, they can be overridden in a subclass:

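# Twitter API v1.1
class ConnectionTester(tweepy.Stream):

    def on_connection_error(self):
        # Disconnect on the first connection error
        self.disconnect()

# Twitter API v2
class ConnectionTester(tweepy.StreamingClient):

    def on_connection_error(self):
        # Disconnect on the first connection error
        self.disconnect()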

Stream.on_request_error / StreamingClient.on_request_error is also passed the HTTP status code that was encountered. The HTTP status codes reference for the Twitter API can be found at https://developer.twitter.com/en/support/twitter-api/error-troubleshooting.

Stream.on_exception / StreamingClient.on_exception is called when an unhandled exception occurs. This is fatal to the stream, and by default, an exception is logged.
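As an illustration of reconnecting "after an appropriate amount of time" while respecting max_retries, here is a minimal sketch of a capped exponential backoff schedule. The `backoff_delays` helper and its default values are assumptions chosen for the example; they are not Tweepy's actual timings, which may differ by error type.

```python
import itertools
import math


def backoff_delays(initial=5.0, maximum=320.0, max_retries=math.inf):
    """Yield the wait in seconds before each reconnection attempt.

    The delay doubles after every attempt until it reaches the cap.
    All default values here are illustrative assumptions.
    """
    delay = initial
    attempt = 0
    while attempt < max_retries:
        yield delay
        delay = min(delay * 2, maximum)
        attempt += 1


# With an unlimited retry budget the schedule never ends, so take a slice.
first_eight = list(itertools.islice(backoff_delays(), 8))
print(first_eight)

# With a finite budget the schedule stops once the retries are used up.
limited = list(backoff_delays(max_retries=3))
print(limited)
```

Doubling the wait keeps a client from hammering the API during an outage, and the cap keeps recovery reasonably prompt once the service is back.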
