Skip to content

CogWorks/LoL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LoL v1.0.2

In order for this code to work, you must have mysql installed. On mac, the easiest way to do this is through brew,

$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

$ brew install mysql

Make copy of credentials-template.py and rename it credentials.py create a file called skiplist.tsv

you can run LoL.py from command line to do certain things. however it is primarily optimized for importing as a module python LoL.py -s [True/False]

-s : option declares whether ssh is being used or not

To Import as Module:

clone the repo. change directory to the repo.

$ python setup.py install

$ easy_install LoL

if "easy_install LoL" does not immediately work, try: $ pip install .

then you should be able to add the following line to gain access to the lol module from lol import LoL

Troubleshooting Install:

may need to run: $ sudo apt-get install libmysqlclient-dev if you get "EnvironmentError: mysql_config not found"

if your machine does not have cryptography: For Debian and Ubuntu, the following command will ensure that the required dependencies are installed:

$ sudo apt-get install build-essential libssl-dev libffi-dev python-dev For Fedora and RHEL-derivatives, the following command will ensure that the required dependencies are installed:

$ sudo yum install gcc libffi-devel python-devel openssl-devel You should now be able to build and install cryptography with the usual

$ pip install cryptography

USAGE:

LoLS = LoL.Scraper(ssh = True, use_curses = False, rate_limiting = False, skipfile = False)

##setting ssh = True uses the SSH = True ssh tunneling protocol from credentials.py, whereas False treats it as if its local host connection ##setting use_curses = True will print through curses rather than print.. ##setting rate_limiting = True will check to see if the key can be used before ever call. ##setting rate = "Fast" will change rate Limits to fast rate limit (1500 requests 10 seconds && 90000 requests 600 seconds) ## default of slow (10 requests 10 seconds, 500 requests 600 seconds) ##setting skipfile to a location will have the module version look for the skiplist.tsv file you created in a location you choose. Otherwise it will assume the file is in your working directory.

# HOW TO USE.

update_table(table= "create", create = False) -------------# creates tables default is false, using create_tables, you can set this true while using any other pseudo-function below

update_table("challenger",queue = "RANKED_TEAM_5x5") -------------# pulls down all members of division I of challenger in given queue, queue is defaulted to 'RANKED_TEAM_5x5' # set queue to: # RANKED_SOLO_5x5 # RANKED_TEAM_3x3 # RANKED_TEAM_5x5

update_table("master", queue = "RANKED_TEAM_5x5") -------------# same as above, but for master division I

update_table("checkteams", teamIds= False, ignore_skiplist = False, allow_updates = False) -- teamIds takes list ; [] -------------# this checks the team table and grabs a list of all the ids. it then gets the league information for each team. # if you supply this with a list of teamIds it will just do those. # setting ignore_skiplist = True will ignore the list that generates whenever the code skips an entry because of no data (this feature was added because of the RANKED_TEAM_5x5 fiasco) # setting allow_updates = True, will allow you ignore the skiplist and will ignore the 'existing entries' and re-call every team in teamIds. This is used to add a new entry if their ranking is different than in the database.

update_table("team", teamIds=False, checkTeams=False) -- teamIds takes list ; [] -------------# this is the primary mechanism, it grabs all the team ids from the by-league table and gets all the information from the team api # it then sorts it into team, team-history, and team-roster tables. # Additionally, this table can be supplied with pretty much any length of team ids and it will iterate through those instead # setting checkTeams to true, automatically calls the update_table("checkteams") function; you can pass any variables you'd pass to checkteams and it will pass through this. (but use default teamIds variable)

update_table("iterate",iteratestart=1, iterate=100, checkTeams=False) -------------# give this function a starting id and it will search for all team-ids associated with that id. # it will do this [iterate] number of times. Once the list is compiled, it sends it through update_table("team") # optionally you can set it so that it will also automatically run update_table("checkteams") to verify [by setting checkTeams=True]. # Note: if you want to start at id "1" and end at id "100" you would need to set iteratestart=1, iterate=100 # This function will now iterate through keys in the same way check teams does

update_table("all") -------------# this function will cycle from updating challenger -> master -> team -> checkteam, will not do iterate # this function takes all variables that challenger, master, team, and checkteam take. with the exception of "teamIds", that uses default.

update_table("match", matchIds=False, timeline=False, timeline_update=False, allow_updates=False) -- matchIds takes list ; [] -------------# this function will import all non-timeline data from a given list of matchIds. if no matchIds are supplied, it will automatically search through the list of matchIds in 'team-history' # timeline=True will now import timeline data too # timeline and regular match data are treated differently. The function makes a skiplist for existing matches, and a separate one for existing timelines. all non-timeline data will be # skipped if the matchid is in the matches table. However, timeline data will still be processed if timeline is set to true. # timeline_update = True will override the default 'search through team history' and only update matches that have been collected and are missing timelines. # allow_updates=True in conjunction with either providing a list of matchIds or not, will no longer exclude existing matches. This is primarily being used to fix timeline info (no long insert into, now replace into)

update_table("membertiers", matchIds=False, ignore_skiplist = False, allow_updates = False) -- matchIds takes list ; [] -------------# this function is essentially the same as the 'checkteams' functionality however this will search a given match and scrape the league data for all the players in that match # if you want to just do all the matches in the database (match table), don't set matchIds to anything. # setting ignore_skiplist = True will ignore the list that generates whenever the code skips an entry because of no data (this feature was added because of the RANKED_TEAM_5x5 fiasco) # setting allow_updates = True, will allow you ignore the skiplist and will ignore the 'existing entries' and re-call every team in teamIds. This is used to add a new entry if their ranking is different than in the database.

update_table("individualhistory", summonerIds = False, just_teams = True, allow_updates = False, season=None, end_time=None) -- summonerIds takes list ; [] -------------# this function adds individual history to the individual_history table. you can supply it with a list of summonerIds if you wish # additionally, if you do not supply with summonerIds, the function takes a look at the just_teams option. if just_teams is set to true # we query all summoner ids in the 'team roster' list and update all of them. if just_teams is False, we use all the summoner ids in the match_participants table [much longer] # allow_updates = True, will no longer skip 'existing entries' so that you can check to see if any people have played more matches. # season must be set to "SEASON3", "SEASON2014", "SEASON2015", "SEASON2016", "PRESEASON3", "PRESEASON2014", "PRESEASON2015", "PRESEASON2016", or None. # end_time must be millisecond level epoch time -- 1457111882000 for example

update_table("stats", summonerIds = False, season=None, allow_updates = False) -- summonerIds takes list ; [] -------------# this function adds the summary 'stats' stats table. you can supply it with a list of summonerIds if you wish. # additionally, if you do not supply with summonerIds, the function takes at the summoner_list table (all the individual histories that have been updated) # season must be set to, None or "ALL" [they are equivalently treated by the function], other accepted values are "3", "2014", "2015", and "2016". # season also supports lists of seasons. you can supply season = ["3", "2015"], and only those will be run. Any invalid entries will be discarded # allow_updates = True is not fully supported yet, please don't use.

## FOR ANY "TABLE" in the UPDATE_TABLE() FUNCTION ~~~~~~~~~~~~~~~~~~ # Hangwait -- default False # setting hangwait="true" enables the function to keep attempting the call until it is allowed through with the current key. # Hangwait is default false, but that only applies if there is only 1 key in your credentials file. # Feedback -- default "all" # setting feedback="all" will print all errors, print a completion statement when a step is finished, and print updates # setting feedback="quiet" will print only uncommon problem errors (duplicate entry errors are silenced), and will print completion statements when long steps are finished # setting feedback="silent" will suppress all printing # suppress_duplicates -- default False # setting suppress_duplicates=True will suppress printing of duplicate entry errors. this only effects feedback="all" as the script overrides this setting for quiet and silent modes # create -- default False # setting create=True will look at the create_tables function within the Scraper class. create_tables will check if all tables in the function exist, and it will create any missing ones. # still debugging this to allow for initial database creation

## Helpful SQL Commands ~~~~~~~~~~~~~~~~~~ ## to get data size #SELECT TABLE_NAME, TABLE_ROWS, DATA_LENGTH / (1024*1024) as "Data Length in MB", INDEX_LENGTH / (1024*1024) as "Index Length in MB", (DATA_LENGTH + INDEX_LENGTH) / (1024*1024) as "Total in MB" FROM information_schema.TABLES WHERE TABLE_SCHEMA = (SELECT DATABASE()) GROUP BY TABLE_NAME ORDER BY "Total in MB";

##FUTURE UPDATES;; #implement skiplist for other tables (matches, teams, ind_history [if such a case arises], etc.).

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages