Skip to content

Latest commit

 

History

History
455 lines (384 loc) · 14.7 KB

README.rst

File metadata and controls

455 lines (384 loc) · 14.7 KB

Python Loklak API

PyPI version Build Status Codecov branch Coverage Status Code Health Dependency Status


If you want to create an alternative twitter search portal, the only way would be to use the official twitter API to retrieve Tweets. But that interface needs an OAuth account and it makes your search portal completely dependent on Twitters goodwill. The alternative is, to scrape the tweets from the twitter html search result pages, but Twitter may still lock you out on your IP address. To circumvent this, you need many clients accessing twitter to scrape search results. This makes it neccessary to create a distributed peer-to-peer network of twitter scrapers which can all organize, store and index tweets. This solution was created with loklak.

What is Loklak ?

It is a server application which is able to collect messages from various sources, including twitter. The server contains a search index and a peer-to-peer index sharing interface.


Why should I use this ?

If you like to be anonymous when searching things, want to archive tweets or messages about specific topics and if you are looking for a tool to create statistics about tweet topics, then you may consider loklak. With loklak you can do: - collect and store a very, very large amount of tweets and similar messages - create your own search engine for tweets - omit authentication enforcment for API requests on the twitter plattform - share tweets and tweet archives with other loklak users - search anonymously on your own search portal - create your own tweet search portal or statistical evaluations - use Kibana to analyze large amounts of tweets as source for statistical data.


Documentation of the API and Usage Examples

To use the loklak app, first an object of the loklak type needs to be created. Do the following to install the pip module and add it to your requirements for the application. Currently our pip package is supported for python2.7 and lower venv versions. Will be supported in python3 version soon!

pip install python-loklak-api

To use the GUI client for API, install wxPython from here and then install Gooey pip module.

pip install Gooey

Then add

from gooey import Gooey
@Gooey

before def main(): in /bin/loklak

Loklak once installed, can be used in the application as

from loklak import Loklak

To create a loklak object you can assign the Loklak() object to a variable. variable = Loklak()

eg. l = Loklak() This creates an objects whose backend loklak server is http://loklak.org/

If you want to set this API to use your own server, you can now define it by doing
l = Loklak('http://192.168.192.5:9000/') for example or pass a URL to it as
l = Loklak('http://loklak-super-cluster.mybluemix.net/')

Note the trailing / is important and so is http://

API Documentation

Status of the Loklak server

Using the object created above, l.status() returns a json of the status as follows

{
  "system": {
    "assigned_memory": 2051014656,
    "used_memory": 1374976920,
    "available_memory": 676037736,
    "cores": 8,
    "threads": 97,
    "runtime": 734949,
    "time_to_restart": 85665051,
    "load_system_average": 18.19,
    "load_system_cpu": 0.24344589731081373,
    "load_process_cpu": 0.018707976134026073,
    "server_threads": 68
  },
  "index": {
    "mps": 176,
    "messages": {
      "size": 1195277012,
      "size_local": 1195277012,
      "size_backend": 0,
      "stats": {
        "name": "messages",
        "object_cache": {
          "update": 51188,
          "hit": 1796,
          "miss": 139470,
          "size": 10001,
          "maxsize": 10000
        },
        "exist_cache": {
          "update": 68419,
          "hit": 2450,
          "miss": 137020,
          "size": 68313,
          "maxsize": 3000000
        },
        "index": {
          "exist": 68634,
          "get": 0,
          "write": 51016
        }
      },
      "queue": {
        "size": 100000,
        "maxSize": 100000,
        "clients": 72
      }
    },
    "users": {
      "size": 65915082,
      "size_local": 65915082,
      "size_backend": 0,
      "stats": {
        "name": "users",
        "object_cache": {
          "update": 51827,
          "hit": 3756,
          "miss": 639,
          "size": 10000,
          "maxsize": 10000
        },
        "exist_cache": {
          "update": 56222,
          "hit": 0,
          "miss": 0,
          "size": 15933,
          "maxsize": 3000000
        },
        "index": {
          "exist": 0,
          "get": 639,
          "write": 51016
        }
      }
    },
    "queries": {
      "size": 4251,
      "stats": {
        "name": "queries",
        "object_cache": {
          "update": 452,
          "hit": 132,
          "miss": 3297,
          "size": 160,
          "maxsize": 10000
        },
        "exist_cache": {
          "update": 3703,
          "hit": 162,
          "miss": 2959,
          "size": 3002,
          "maxsize": 3000000
        },
        "index": {
          "exist": 2959,
          "get": 176,
          "write": 292
        }
      }
    },
    "accounts": {"size": 96},
    "user": {"size": 790137},
    "followers": {"size": 146},
    "following": {"size": 135}
  },
  "client_info": {
    "RemoteHost": "103.43.112.99",
    "IsLocalhost": "false",
    "request_header": {
      "Cookie": "__utma=156806566.949140694.1455798901.1455798901.1455798901.1; __utmc=156806566; __utmz=156806566.1455798901.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)",
      "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
      "Upgrade-Insecure-Requests": "1",
      "X-Forwarded-Proto": "http",
      "Connection": "close",
      "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36",
      "X-Forwarded-For": "103.43.112.99",
      "Host": "loklak.org",
      "Accept-Encoding": "gzip, deflate, sdch",
      "Accept-Language": "en-US,en;q=0.8",
      "X-Real-IP": "103.43.112.99"
    }
  }
}
Settings of the loklak server (strictly only for localhost clients)

Using the class method settings() to returns a json of the settings being used by the loklak server

Hello test - Check if the server is responding properly and is online

Using the object created above l.hello() returns a json response of the server status

When the server is online, the json should read

{"status": "ok"}
Peers - API To find out the loklak peers

Finding the list of loklak peers, use the object created above l.peers() which returns a json response containing all the peers connected to loklak.org

Users API

What this can do ?

  • Fetch the details of one user
  • Fetch the details of the user along with number of their followes and following
  • Fetch only the followers / following of a particular user

Query Structure: l.user(<username>, <followers count>, <following count>)

<username> is a string, e.g. 'loklak_app'
<followers count> and <following count> is a numeric or a string or None
e.g.
1. l.user('loklak_app')
2. l.user('loklak_app', 1000) - 1000 followers of loklak_app
3. l.user('loklak_app', 1000, 1000) - 1000 followers and following of loklak_app
4. l.user('loklak_app', None, 1000) - 1000 following of loklak_app
Accounts API

LOCALHOST ONLY, Loklak server running on port localhost:9000

To query the user account details of the data within the loklak server, use l.account('name') where 'name' is the screen_name of the user whose information is required.

To update the user details within the server, package a json object with the following parameters and other parameters which needs to be pushed to the server and use the action=update where action is the 2nd parameter of the account() api

l.account('name', 'update', '{ json object }')

Search API

Public search API for the scraped tweets from Twitter.

Query structure: search('querycontent', 'since date', 'until date', 'from a specific user', '# of tweets')

e.g. l.search('doctor who')

A search result in json looks as follows.

{
  "search_metadata" : {
    "itemsPerPage" : "100",
    "count" : "100",
    "count_twitter_all" : 0,
    "count_twitter_new" : 100,
    "count_backend" : 0,
    "count_cache" : 97969,
    "hits" : 97969,
    "period" : 18422,
    "query" : "doctor who",
    "client" : "103.43.112.99",
    "time" : 4834,
    "servicereduction" : "false",
    "scraperInfo" : "http://kaskelix.de:9000,local"
  },
  "statuses" : [ {
    "created_at" : "2015-03-03T19:30:43.000Z",
    "screen_name" : "exanonym77s",
    "text" : "check #DoctorWho forums #TheDayOfTheDoctor #TheMaster @0rb1t3r http://www.thedoctorwhoforum.com/ https://pic.twitter.com/FvW6J9WMCw",
    "link" : "https://twitter.com/ronakpw/status/572841550834737152",
    "id_str" : "572841550834737152",
    "source_type" : "TWITTER",
    "provider_type" : "SCRAPED",
    "retweet_count" : 0,
    "favourites_count" : 0,
    "hosts" : [ "www.thedoctorwhoforum.com", "pic.twitter.com" ],
    "hosts_count" : 2,
    "links" : [ "http://www.thedoctorwhoforum.com/", "https://pic.twitter.com/FvW6J9WMCw" ],
    "links_count" : 2,
    "mentions" : [ "@0rb1t3r" ],
    "mentions_count" : 1,
    "hashtags" : [ "DoctorWho", "TheDayOfTheDoctor", "TheMaster" ],
    "hashtags_count" : 3,
    "without_l_len" : 62,
    "without_lu_len" : 62,
    "without_luh_len" : 21,
    "user" : {
      "name" : "Example User Anyone",
      "screen_name" : "exanonym77s",
      "profile_image_url_https" : "https://pbs.twimg.com/profile_images/567071565473267713/4hiyjKkF_bigger.jpeg",
      "appearance_first" : "2015-03-03T19:31:30.269Z",
      "appearance_latest" : "2015-03-03T19:31:30.269Z"
    }
  }, ...
  ]
}

Mentioning the Since and Until dates

e.g. l.search('sudheesh001', '2015-01-10', '2015-01-21')

Which results in a json as follows

{
 "search_metadata" : {
    "itemsPerPage" : "100",
    "count" : "100",
    "count_twitter_all" : 0,
    "count_twitter_new" : 100,
    "count_backend" : 0,
    "count_cache" : 97969,
    "hits" : 97969,
    "period" : 18422,
    "query" : "doctor who",
    "client" : "103.43.112.99",
    "time" : 4834,
    "servicereduction" : "false",
    "scraperInfo" : "http://kaskelix.de:9000,local"
  },
  "statuses" : [ {
    "timestamp" : "2016-05-11T16:53:46.615Z",
    "created_at" : "2016-05-11T16:52:59.000Z",
    "screen_name" : "BelleRinger1",
    "text" : "I would love to see http://www.cultbox.co.uk/?p=53662",
    "link" : "https://twitter.com/BelleRinger1/status/730440578031190016",
    "id_str" : "730440578031190016",
    "source_type" : "TWITTER",
    "provider_type" : "SCRAPED",
    "retweet_count" : 0,
    "favourites_count" : 0,
    "images" : [ ],
    "images_count" : 0,
    "audio" : [ ],
    "audio_count" : 0,
    "videos" : [ ],
    "videos_count" : 0,
    "place_name" : "",
    "place_id" : "",
    "place_context" : "ABOUT",
    "hosts" : [ "www.cultbox.co.uk" ],
    "hosts_count" : 1,
    "links" : [ "http://www.cultbox.co.uk/?p=53662" ],
    "links_count" : 1,
    "mentions" : [ ],
    "mentions_count" : 0,
    "hashtags" : [ ],
    "hashtags_count" : 0,
    "classifier_language" : "english",
    "classifier_language_probability" : 6.95489E-8,
    "without_l_len" : 19,
    "without_lu_len" : 19,
    "without_luh_len" : 19,
    "user" : {
      "screen_name" : "BelleRinger1",
      "user_id" : "2497345790",
      "name" : "Belle Gaudreau",
      "profile_image_url_https" : "https://pbs.twimg.com/profile_images/723262970805907456/RbMnyEqs_bigger.jpg",
      "appearance_first" : "2016-05-11T16:53:46.615Z",
      "appearance_latest" : "2016-05-11T16:53:46.615Z"
    }
  }, ...
  ]
}

Valid parameters for since and until can also be None or any YMD date format. Looking towards the future releases to resolve this to any date format.

The from a specific user parameter makes sure that the results obtained for the given query are only from a specific user.

e.g. l.search('doctor who', '2015-01-10', '2015-01-21','0rb1t3r')

The # of tweets parameter is how many tweets will be returned.

e.g. l.search('avengers', None, None, 'Iron_Man', 3)

Aggregations API
GeoLocation API

Loklak allows you to fetch required information about a country or city.

e.g. l.geocode(['Barcelona']), l.geocode(['place1', 'place2'])