Make tests working on Windows #637

Merged
merged 3 commits into from
Dec 23, 2020

ftesser
Contributor

@ftesser ftesser commented Nov 19, 2020

This PR is related to #636.
This will be an incremental PR: after this first commit, some other changes are still needed to run all the tests successfully on Windows.

This first commit removes the unused 'from torch.distributed import all_gather' from 'farm\modeling\prediction_head.py'. It also adds a newline there.

@ftesser
Contributor Author

ftesser commented Nov 19, 2020

After the first commit, I noticed this error in the Windows tests (note that, by contrast, the tests on Linux always run successfully):

test_natural_questions.py:43: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\venv\lib\site-packages\farm\data_handler\data_silo.py:113: in __init__
    self._load_data()
..\venv\lib\site-packages\farm\data_handler\data_silo.py:215: in _load_data
    self.data["train"], self.tensor_names = self._get_dataset(train_file)
..\venv\lib\site-packages\farm\data_handler\data_silo.py:141: in _get_dataset
    dicts = list(self.processor.file_to_dicts(filename))
..\venv\lib\site-packages\farm\data_handler\processor.py:1383: in file_to_dicts
    dicts = read_jsonl(file, proxies=self.proxies)
..\venv\lib\site-packages\farm\data_handler\utils.py:119: in read_jsonl
    dicts = [json.loads(l) for l in open(file)]
..\venv\lib\site-packages\farm\data_handler\utils.py:119: in <listcomp>
    dicts = [json.loads(l) for l in open(file)]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <encodings.cp1252.IncrementalDecoder object at 0x00000142D4728280>
input = b'edia.org/w/index.php?title=List_of_WWE_Champions&oldid=866450897 \'\' Categories : <Ul> <Li> World heavyweight wrest...evel": false, "end_token": 5090}, {"start_token": 5090, "top_level": false, "end_token": 5204}, {"start_token": 5204, '
final = False

    def decode(self, input, final=False):
>       return codecs.charmap_decode(input,self.errors,decoding_table)[0]
E       UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1012: character maps to <undefined>

c:\python\lib\encodings\cp1252.py:23: UnicodeDecodeError

So a second commit is needed to fix that.
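
The traceback shows `open(file)` falling back to the Windows default codec (cp1252), which cannot decode byte 0x8d. The actual change in the second commit is not shown in this thread; as a minimal sketch, assuming the data file is UTF-8 encoded, the read could pass the encoding explicitly. The name `read_jsonl_utf8` is hypothetical and is not FARM's `read_jsonl`.

```python
import json


def read_jsonl_utf8(path):
    """Read a JSON Lines file, decoding it as UTF-8 on every platform.

    Hypothetical helper for illustration only; not FARM's read_jsonl.
    """
    # Without encoding=..., open() uses the locale's preferred encoding,
    # which is cp1252 on many Windows setups and UTF-8 on most Linux ones.
    with open(path, encoding="utf-8") as f:
        # Blank lines are skipped so a trailing newline does not break json.loads.
        return [json.loads(line) for line in f if line.strip()]
```

With an explicit encoding the result no longer depends on the machine's locale, so the same file is read the same way on Linux and Windows.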

@ftesser ftesser changed the title Make tests working on Windws Make tests working on Windows Nov 19, 2020
@ftesser
Contributor Author

ftesser commented Nov 19, 2020

After the first two commits, there is only one test that fails on Windows: test.test_dpr.test_dpr_modules
The reason is:

E           AttributeError: module 'torch.distributed' has no attribute 'get_rank'

..\venv\lib\site-packages\farm\modeling\prediction_head.py:1644: AttributeError

I see that this refers to the logits_to_loss method in prediction_head.py:

        # Check if DDP is initialized
        try:
            rank = torch.distributed.get_rank()
        except AssertionError:
            rank = -1

Any suggestions for making this work in the Windows tests as well?
If DDP does not work on Windows, perhaps we can exclude this test when running on Windows?
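
One possible direction, as a minimal sketch rather than a tested fix for FARM: the snippet above only catches AssertionError, so a build where `get_rank` is missing altogether raises AttributeError instead. The check could first ask `is_available()` and `is_initialized()`, which are real `torch.distributed` functions. The helper name `get_rank_or_default` is hypothetical, and the module is passed in as `dist` (expected to be `torch.distributed`) so the sketch does not need torch to be importable.

```python
def get_rank_or_default(dist, default=-1):
    """Return the DDP rank, or `default` when DDP cannot be used.

    `dist` is expected to be the torch.distributed module (an assumption of
    this sketch); the helper itself is hypothetical and not part of FARM.
    """
    # is_available() is expected to exist even on builds without
    # distributed support, where it returns False.
    if not dist.is_available():
        return default
    # Distributed support is compiled in, but no process group was set up.
    if not dist.is_initialized():
        return default
    return dist.get_rank()
```

Called as `get_rank_or_default(torch.distributed)`, this would give -1 on a single-process run, which matches the fallback value in the original try/except.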

@ftesser
Contributor Author

ftesser commented Nov 26, 2020

My current proposal is to exclude test.test_dpr.test_dpr_modules only when pytest is run on a Windows machine.
If you think it is somehow possible to test dpr_modules without DDP, that would probably be better, but for the moment this solution allows testing almost everything on Windows as well.

Let me know what you think.
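
As a rough sketch of what such an exclusion could look like (the change actually made in this PR is not shown in the thread), pytest's `skipif` marker can be keyed on `sys.platform`. The test body below is a placeholder, and the reason string is illustrative.

```python
import sys

import pytest

# sys.platform is "win32" on Windows, including 64-bit interpreters.
ON_WINDOWS = sys.platform == "win32"


@pytest.mark.skipif(
    ON_WINDOWS,
    reason="torch.distributed has no get_rank on Windows with this PyTorch version",
)
def test_dpr_modules():
    # Placeholder body for illustration; the real test lives in test/test_dpr.py.
    assert True
```

On Linux the condition is false and the test runs as usual; on Windows pytest reports it as skipped together with the reason.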

@Timoeller
Contributor

Hey @ftesser, thanks for looking into this.
Generally I believe it doesn't make much sense to have different code/tests for different distributions. It would be tedious to keep track of all the individual changes.

The DDP module is quite important for making DPR trainable so excluding it would not be preferred.

Looking into PyTorch support for DDP on Windows, I found this PR: pytorch/pytorch#45335, which was merged on Sep 25th. So this code is only in the newest PyTorch 1.7.0 release (we are currently using PyTorch 1.6.0).

So, actionable insights: we want to do a FARM release later this week. After that we can bump the PyTorch version and see if this fixes the DDP issues. Would that be a solution for you? Of course you can try updating PyTorch beforehand yourself; we would appreciate your insights.

@Timoeller Timoeller mentioned this pull request Nov 30, 2020
@ftesser
Contributor Author

ftesser commented Dec 1, 2020

Thanks @Timoeller for your feedback.
Of course, if DDP works on Windows with PyTorch 1.7.0, that is the best solution, so I am OK with waiting for that. I will try updating PyTorch beforehand myself.

Just a note about this PR: commit 0aedd05 fixes a UnicodeDecodeError, so it should be applied in any case.

@Timoeller
Contributor

Sorry for the delay, @ftesser.
We actually made huge changes in #649 that required a lot of testing.

We also quickly tested updating PyTorch in #660, which produced failing ONNX conversion tests. Unfortunately we will only be able to look into this at the start of the new year.

I will merge your PR now to include this nice patch in the coming release.

@Timoeller Timoeller merged commit e7b6a2a into deepset-ai:master Dec 23, 2020