Logic of the scripts #2

albusdemens · 2021-01-05T16:55:15Z

Hello again, could you check if I am running the scripts in the right order? As far as I understood, the recipe is the following:

1. init.py --> initialise the steamids.txt file, which lists a few IDs of Steam users.
2. Get more data using 00_collect_steamid.py. The script keeps running until stopped.
3. 00_collect_private_public_index.py --> collect the status of the IDs (they can be public or private).
4. 00_collect_data.py --> collect statistics for the listed IDs which are also public.
5. 01_clean.py --> Clean collected data, combining datasets into a single one.
6. 02_eda.ipynb --> Do some data analysis.
7. 03_resample.py --> Resample, to take into account that our dataset is not balanced. Considered approaches: under-sampling and over-sampling.
8. 04_models.py --> Run and optimise ML methods for cheater detection.

Am I missing something? Also, when I launch 00_collect_steamid.py I get a NameError (name 'vacbanned_last20' is not defined). Do you know how to fix it? Thanks heaps!

vh42720 · 2021-01-06T00:56:20Z

Hi Alberto,

I've just got your email back from work. The script order looks good. However, omit step 3. As noted in the script, it is for when you want to get the data in one go, which will not happen without a special API key.

collect_steamid.py uses two string variables: vacbanned_last20 and vaclist_last20. Both are websites that list the last 20 Steam IDs that were entered into the website to check for VAC ban status, so this script works exclusively with those websites. However, at their request, I cannot give out the websites in question. Steps to fix:

1. Google the websites that give VAC ban status.
2. Locate where the links to the last 20/10 IDs are.
3. Change the script to work with the links in question. It should be easy with requests parsing.

Secondly, you will need to play around with the data collection scripts, since the Steam API allows only 100k requests, which is only enough for 5k-7k Steam IDs at a time.

If you have any questions, please let me know.

Best,
Vinh Hang
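As a rough illustration of the two points in the reply above (parsing the "last 20" IDs out of a page, and staying within the request budget), here is a minimal sketch. It is not the repository's code: the function names `extract_steam_ids`, `ids_per_budget` and `batches` are hypothetical, and the figure of 20 requests per ID is only an assumption derived from the "100k requests for 5k-7k IDs" estimate. The regular expression relies on SteamID64 values for individual accounts being 17-digit numbers that start with 7656119.

```python
import re

# SteamID64 values for individual accounts: 17 digits, starting with 7656119.
STEAMID64_RE = re.compile(r"\b7656119\d{10}\b")


def extract_steam_ids(html):
    """Return the unique SteamID64 strings found in a page's HTML,
    in order of first appearance (hypothetical helper)."""
    # dict.fromkeys drops duplicates while preserving order.
    return list(dict.fromkeys(STEAMID64_RE.findall(html)))


def ids_per_budget(request_budget, requests_per_id):
    """How many IDs can be fully processed with the given request budget."""
    return request_budget // requests_per_id


def batches(ids, size):
    """Yield consecutive chunks of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


if __name__ == "__main__":
    # Stand-in for the HTML of a "last 20 checked" page; the real sites
    # are not named in this thread, so no URL is given here.
    sample_html = (
        '<a href="/profiles/76561198000000001">a</a>'
        '<a href="/profiles/76561198000000002">b</a>'
        '<a href="/profiles/76561198000000001">a again</a>'
    )
    ids = extract_steam_ids(sample_html)

    # Assumption: about 20 API requests per ID (100k requests / 5k IDs).
    batch_size = ids_per_budget(100_000, 20)
    for batch in batches(ids, batch_size):
        print(len(batch), "IDs in this batch")
```

In a real run, the HTML string would come from fetching whichever site you find, for example with `requests.get(url).text`, and each batch would be passed to the data collection script before waiting for the request budget to reset.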

albusdemens · 2021-01-13T23:25:19Z

Hi Vinh,

Thank you for your reply and for the clarification! One last question: how should I define `sampling_df, sampling_score = sampling_dict.copy(), result_dict.copy()` in 04_models.py?

Best,
Alberto

