Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AttributeError when trying to load replay #61

Closed
stefanpantic opened this issue Jul 17, 2018 · 25 comments
Closed

AttributeError when trying to load replay #61

stefanpantic opened this issue Jul 17, 2018 · 25 comments

Comments

@stefanpantic
Copy link

I get this error whenever I try to load a Replay. I tried reinstalling and installing directly from GitHub but the error persists. Any ideas on what could be causing this?

  File "<stdin>", line 1, in <module>
  File "/home/stefan/Documents/ML/env/lib/python3.6/site-packages/sc2reader/factories/sc2factory.py", line 85, in load_replay
    return self.load(Replay, source, options, **new_options)
  File "/home/stefan/Documents/ML/env/lib/python3.6/site-packages/sc2reader/factories/sc2factory.py", line 137, in load
    return self._load(cls, resource, filename=filename, options=options)
  File "/home/stefan/Documents/ML/env/lib/python3.6/site-packages/sc2reader/factories/sc2factory.py", line 146, in _load
    obj = cls(resource, filename=filename, factory=self, **options)
  File "/home/stefan/Documents/ML/env/lib/python3.6/site-packages/sc2reader/resources.py", line 312, in __init__
    engine.run(self)
  File "/home/stefan/Documents/ML/env/lib/python3.6/site-packages/sc2reader/engine/engine.py", line 177, in run
    for new_event in (event_handler(event, replay) or []):
  File "/home/stefan/Documents/ML/env/lib/python3.6/site-packages/sc2reader/engine/plugins/context.py", line 57, in handleTargetUnitCommandEvent
    self.last_target_ability_event[event.player.pid] = event
AttributeError: 'NoneType' object has no attribute 'pid'

@dsjoerg
Copy link
Member

dsjoerg commented Jul 17, 2018

Is this a standard ladder replay? If not, that could be the problem. I'm sure there are types of replays that sc2reader needs to be tweaked to be able to handle.

@StoicLoofah
Copy link
Collaborator

Yeah, I have run into issues like this before, though it does tend to be replay-specific. Can you provide a download link for an example of a problematic replay?

@stefanpantic
Copy link
Author

I've tried several from here and they all gave the same error.
http://blzdistsc2-a.akamaihd.net/ReplayPacks/4.1.2.60604-20171220_014559-1.zip

I forgot to mention my client version (4.1.2), and Python version (3.6.5).

@StoicLoofah
Copy link
Collaborator

@stefanpantic I tried opening the file linked, and it's requesting a password to open the zip file. Any help on that?

@stefanpantic
Copy link
Author

My bad i forgot to include that. It's"iagreetotheeula".

cclauss pushed a commit to cclauss/sc2reader that referenced this issue Jul 19, 2018
There are a class of issues like ggtracker#61 where __event.player is None__ so this PR tries to detect that condition higher in the call stack.
@StoicLoofah
Copy link
Collaborator

StoicLoofah commented Jul 20, 2018

Okay, so I was able to diagnose where we're not getting the data we need. However, I have no idea how to fix it.

On the replay that I checked, the event.player == None, but event.pid == 1. When I checked, both replay.player and replay.players were empty.

Those are supposed to be populated here

def load_players(self):
# If we don't at least have details and attributes_events we can go no further
if 'replay.details' not in self.raw_data:

but replay.details does not appear in self.raw_data.

that is supposed to be loaded at

self._read_data(data_file, self._get_reader(data_file))

which ends up invoking this

try:
file_data = archive.read_file(data_file, force_decompress=True)

and there, file_data == None. It doesn't even hit the exception, though when I checked recovery_attempt() == None.

Note that there is real data here in replays that parse correctly (though it did hit the recovery_attempt case on the replay I tried), so I think this is the gap.

This is far beyond the parts of sc2reader that I have ever had to deal with, so I really have no idea what's going on here. Calling on @GraylinKim to see if he has any memory of what's going on here. Otherwise, I'm not sure where to go from here =(

@cclauss
Copy link
Collaborator

cclauss commented Jul 20, 2018

except Exception as e: is a bad practice so, would it be possible to replace Exception with ImportError (not likely to happen here) and then do a few trial runs and catch each exception that actually occurs in its own except block?

try:
    file_data = archive.read_file(data_file, force_decompress=True)
except AError as e:
    file_data = recovery_attempt()
    if file_data is None:
        raise
except BError as e:
    file_data = recovery_attempt()
    if file_data is None:
        raise
except CError as e:
    file_data = recovery_attempt()
    if file_data is None:
        raise

This way we can understand what is actually going wrong before we propose that the solution is file_data = recovery_attempt(). Once you understand the problem and have an effective solution, you can condense it back down to:

try:
    file_data = archive.read_file(data_file, force_decompress=True)
except (AError, BError, CError) as e:
    file_data = recovery_attempt()
    if file_data is None:
        raise

But we should avoid sweeping helpful information under the rug with except Exception as e:.

@StoicLoofah
Copy link
Collaborator

@cclauss definitely good practice if we can narrow in on the exception types, and would definitely encourage anyone to take a shot at it!

For this specific issue, though, i'll clarify that we aren't even hitting the exception: the function is actually running without an error. The problem is that it's returning None rather than any data. As to how it's coming to that, I have no idea =(

@Gusgus01
Copy link
Contributor

Gusgus01 commented Nov 5, 2018

@StoicLoofah I dug into it a bit further :)

If you load those replays up in Starcraft 2, the players don't have names. I'm not sure if at some point they had names and even the Starcraft client can't parse them now or they never had names.

Also peeking into the MPQArchive using mpyq, I see that the usual replay.details file is missing which would explain why replay.details does not appear in self.raw_data. However, there is a replay.details.backup file in the archive, but it would appear to be missing the player name. Also, the current BitPackedDecoder available in decoders.py manages to mostly parse the data, but not fully.

Should the package support using replay.details.backup? If so I can dig into this further and try to figure out the parsing more, but it appears there won't be a player name either way and it looks to be missing other data as well. A better course of action if we want to support these replay files might be to make it fall back to using the pid instead or some default string for each player.

@Necdilzor
Copy link

but it appears there won't be a player name either way and it looks to be missing other data as well. A better course of action if we want to support these replay files might be to make it fall back to using the pid instead or some default string for each player.

These replays were anonymized by Blizzard, removing every player's specific data. When you load them using the sc2client-api you can't see their names either and it defaults to "1" and "2" as player names.

@cclauss
Copy link
Collaborator

cclauss commented Nov 14, 2018

Can we please consider something along the lines of #62?

@StoicLoofah
Copy link
Collaborator

@Gusgus01 thanks for looking more deeply into this! My short answer is that I would love to have a good solution for this but don't have a strong opinion on how to do it.

In the general case, it sounds like we won't stumble across replays with this problem in the wild: they just happen to be how Blizzard treated this particular data set, but since they are official, that makes them pretty relevant.

As I mentioned above, I really don't understand how the archive reading logic works: most of my tweaks to sc2reader start assuming that it was unpacked properly, then working to massage that into the python class structure. I would be happy to look at any PRs that manage to avoid crashes. From there, we can talk more about how we want to fill in details e.g. using pids or placeholder data.

@lfricken
Copy link

So I wanted to try solving this myself (I really want to use this library for those replays by blizzard). I had resources.py use replay.details.backup and replay.initData.backup and it successfully loaded some stuff, including players. But there were some problems.

replay.tracker.events doesn't exist, so I can't seem to find any UnitBornEvent's

@cclauss
Copy link
Collaborator

cclauss commented Nov 20, 2018

When replay.tracker.events doesn’t exist, would it be possible for you to create a dummy event with string values set to UnitBornWorkaround or BlizzardWorkaround and numeric values set to zero?

@Gusgus01
Copy link
Contributor

I've been toying with some ways to fall back to the backup files when the main ones aren't available, but ran into similar problems as @lfricken (as well as some more cosmetic issues like some print statements messing up).

I think he is saying the replays are a lot less useful without the TrackerEvents, since they are most of the more interesting things like UnitBorn, UnitDied, UpgradeFinished. Is there another source for these events that I didn't see?

There might be some things you could accomplish like tracking the usage of abilities like MorphDrone and such in replay.game.event, but that seems like a large undertaking.

Also, looks like someone we know (you may recognize that name from issue #63) has asked this question on the blizzard forums: https://us.battle.net/forums/en/bnet/topic/20769677762
So looks to be an issue with most (if not all) replays that blizzard serves up via their replay-api.

@lfricken
Copy link

lfricken commented Nov 20, 2018

As above, the script didn't crash, but unit data is what I want from these replays, so they are otherwise useless.
replay.players(0).units is empty because there were no UnitBornEvents.

So I imagine blizzard tried to compress these replays as much as possible. (Since there are still 44gb worth, and hence their small size and lack of data). The fact that you can view these replays in sc2 means it must be possible to derive all information from these replays though.

Would it be possible to run these replays through sc2 or the api, then tell it to save the replay again, and get the expanded version?

Also noteworthy, someone on the Facebook page for StarCraft 2 ai linked me this and claimed it could get the unit data from these replays. When I viewed the pre computed database though, there was no unit data anywhere even though MSC has code for handling it, and a lot of documentation is about units.

https://github.com/wuhuikai/MSC

@Gusgus01
Copy link
Contributor

The removal of data is part of the anonymization process that Blizzard does to the replays for AI and ML use. Not sure the exact reason replay.tracker.events gets removed as part of that, possibly as you mentioned for size reasons.

I've seen MSC before. It uses s2clientprotocol (https://github.com/Blizzard/s2client-proto) and pysc2 (https://github.com/deepmind/pysc2). The former being the protocol to interact with Starcraft 2 and the latter does a lot of work to make it easier to interact with Starcraft 2 in a research oriented way. It would appear that MSC uses pysc2 to launch the Starcraft client and load up a replay from which it gets the data to analyze.

Just thinking about it briefly, I can think of a few options. If the analysis you want to do is simpler, some of it could be logic'd from the GameEvents (like did they start a specific upgrade in each of these replays, etc). You could also look at MSC and see how they retrieve the actions from replays using pysc2. If you really wanted to use SC2Reader, you could see if you can save replays with all that file once you have ran through the replay using pysc2.

@Gusgus01
Copy link
Contributor

Gusgus01 commented Nov 24, 2018

Created a PR to fallback to the backup files when the main files aren't there. Let me know what you guys think. I did not attempt to do anything about the lack of replay.tracker.events.

@lfricken
Copy link

lfricken commented Nov 25, 2018

I had tried telling pysc2 to save the replay again, but it pretty explicitly doesn't want that. It said something along the lines of "can't save a replay of a replay". And either way, doing that for 100k replays isn't really feasible since it has to replay the whole thing in 8x time at best. If I could process a replay per minute (unlikely) that would still take 70 days for 100k replays 🙁

@Gusgus01
Copy link
Contributor

Looking at MSC, it looks like that is exactly what they did. They use multiprocessing and threads to do multiple at once.

@StoicLoofah
Copy link
Collaborator

Just merged in #69 . 2 questions for thread participants:

  1. do you guys want a new tagged release version for this?
  2. i think this technically fulfills the scope of this issue, though we still don't have the trackerevents. Would you prefer we keep this issue, or should I close and create a new one?

@Gusgus01
Copy link
Contributor

  1. I'll let you decide on that one :)

  2. I definitely think this fulfills this issue. I'm not sure how to go about dealing with the trackerevents bit since it looks like it will require using pysc2 (or sc2clientprotocol) and the sc2executable in order to replay the replay. I could see maybe adding a way to sideload data and then couple that with a tool that saves off the trackerevents from a replayed replay.

@cclauss
Copy link
Collaborator

cclauss commented Nov 28, 2018

My vote would be 1) a tagged release because this is a significant improvement and 2) close this issue and open a new one.

Great work here @Gusgus01 !!

@Gusgus01
Copy link
Contributor

Thanks :)

@StoicLoofah
Copy link
Collaborator

Created #74 as a followup. Thanks @Gusgus01 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants