Make tests working on Windows #637

ftesser · 2020-11-19T08:16:40Z

This PR is related to #636.
This will be an incremental PR: after this first commit, some other changes are still needed to run all the tests successfully on Windows.

This first commit removes the unused 'from torch.distributed import all_gather' from 'farm\modeling\prediction_head.py'. It also adds a newline there.

ftesser · 2020-11-19T08:25:21Z

After the first commit, I noticed this error in the Windows tests (note that on my side the tests on Linux always run successfully):

test_natural_questions.py:43: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\venv\lib\site-packages\farm\data_handler\data_silo.py:113: in __init__
    self._load_data()
..\venv\lib\site-packages\farm\data_handler\data_silo.py:215: in _load_data
    self.data["train"], self.tensor_names = self._get_dataset(train_file)
..\venv\lib\site-packages\farm\data_handler\data_silo.py:141: in _get_dataset
    dicts = list(self.processor.file_to_dicts(filename))
..\venv\lib\site-packages\farm\data_handler\processor.py:1383: in file_to_dicts
    dicts = read_jsonl(file, proxies=self.proxies)
..\venv\lib\site-packages\farm\data_handler\utils.py:119: in read_jsonl
    dicts = [json.loads(l) for l in open(file)]
..\venv\lib\site-packages\farm\data_handler\utils.py:119: in <listcomp>
    dicts = [json.loads(l) for l in open(file)]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <encodings.cp1252.IncrementalDecoder object at 0x00000142D4728280>
input = b'edia.org/w/index.php?title=List_of_WWE_Champions&oldid=866450897 \'\' Categories : <Ul> <Li> World heavyweight wrest...evel": false, "end_token": 5090}, {"start_token": 5090, "top_level": false, "end_token": 5204}, {"start_token": 5204, '
final = False

    def decode(self, input, final=False):
>       return codecs.charmap_decode(input,self.errors,decoding_table)[0]
E       UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1012: character maps to <undefined>

c:\python\lib\encodings\cp1252.py:23: UnicodeDecodeError

So the second commit is needed to fix that.
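
For readers following along: the traceback shows `open(file)` falling back to the Windows default codec (cp1252), which has no mapping for byte 0x8d, whereas that byte is a valid continuation byte in UTF-8. A minimal sketch of the usual remedy, assuming the data file is UTF-8 encoded, is to pass the encoding explicitly. The function name `read_jsonl_utf8` is hypothetical and is not necessarily what the commit changes in FARM.

```python
import json


def read_jsonl_utf8(path):
    """Read a JSON Lines file, decoding it as UTF-8 regardless of the OS default.

    Hypothetical helper for illustration; assumes the file is UTF-8 encoded.
    """
    # An explicit encoding avoids the locale-dependent default (cp1252 on many
    # Windows setups) that raised the UnicodeDecodeError above.
    with open(path, encoding="utf-8") as handle:
        # Skip blank lines so a trailing newline does not break json.loads.
        return [json.loads(line) for line in handle if line.strip()]
```

With the encoding fixed in the call, the same file is decoded identically on Linux and Windows.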

ftesser · 2020-11-19T09:02:34Z

After the first two commits, only one test fails on Windows: test.test_dpr.test_dpr_modules
The reason is:

E           AttributeError: module 'torch.distributed' has no attribute 'get_rank'

..\venv\lib\site-packages\farm\modeling\prediction_head.py:1644: AttributeError

I see that this refers to the logits_to_loss method in prediction_head.py:

# Check if DDP is initialized
try:
    rank = torch.distributed.get_rank()
except AssertionError:
    rank = -1

Any suggestions for making this work in the Windows tests as well?
If DDP does not work on Windows, perhaps we can exclude this test when running on Windows?
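
One possible direction, sketched here only as an illustration: the `except AssertionError` branch does not cover the `AttributeError` raised when `get_rank` is missing altogether, so the lookup could be guarded instead. The helper name `get_rank_or_default` is hypothetical, and the sketch assumes the module exposes `is_available()` and, when distributed support is compiled in, `is_initialized()`. It takes the module as an argument so it can be exercised without PyTorch installed.

```python
from types import SimpleNamespace


def get_rank_or_default(dist, default=-1):
    """Return dist.get_rank() when a process group is usable, else `default`.

    `dist` is expected to look like the torch.distributed module; this is a
    hypothetical helper, not code from FARM.
    """
    # Builds without distributed support may lack most of the API, so every
    # attribute is looked up defensively.
    is_available = getattr(dist, "is_available", None)
    if is_available is None or not is_available():
        return default
    is_initialized = getattr(dist, "is_initialized", None)
    if is_initialized is None or not is_initialized():
        return default
    return dist.get_rank()


# Stand-ins for illustration: a build without DDP and an initialised group.
no_ddp = SimpleNamespace(is_available=lambda: False)
with_ddp = SimpleNamespace(
    is_available=lambda: True,
    is_initialized=lambda: True,
    get_rank=lambda: 3,
)
```

In real code the call would be `get_rank_or_default(torch.distributed)`. Whether the rest of the DPR loss path then works without DDP is a separate question that this sketch does not answer.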

ftesser · 2020-11-26T10:43:11Z

My current proposal is to exclude test.test_dpr.test_dpr_modules only when pytest is run on a Windows machine.
If you think it is somehow possible to test dpr_modules without DDP, that would probably be better, but for the moment this solution makes it possible to test almost everything on Windows as well.

Let me know what you think.
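
A platform-conditional skip of this kind can be sketched as follows. This is not the code from the commit: it uses the standard library's `unittest.skipIf` so that it runs without extra dependencies, and the test body is a placeholder. In a pytest suite the usual spelling would be `pytest.mark.skipif` with the same condition.

```python
import sys
import unittest

# True on Windows interpreters, where sys.platform is "win32".
ON_WINDOWS = sys.platform.startswith("win")


class TestDprModules(unittest.TestCase):
    @unittest.skipIf(ON_WINDOWS, "DDP is not available on Windows here")
    def test_dpr_modules(self):
        # Placeholder body; the real test exercises the DPR modules.
        self.assertTrue(True)
```

The skip reason shows up in the test report, which keeps the exclusion visible instead of silently dropping the test.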

Timoeller · 2020-11-30T16:12:12Z

Hey @ftesser thanks for looking into this.
Generally I believe it doesn't make much sense having different code/tests for different distributions. It will be tedious to keep track of all the individual changes.

The DDP module is quite important for making DPR trainable so excluding it would not be preferred.

Looking into PyTorch support for DDP on Windows, I found this PR: pytorch/pytorch#45335, which was merged on Sep 25th. So this code is only in the newest PyTorch 1.7.0 release (we are currently using PyTorch 1.6.0).

So actionable insights: We want to do a FARM release later this week, after this we can increment the pytorch version and see if this fixes the DDP issues. Would that be a solution for you? Of course you can try updating pytorch beforehand yourself, we would appreciate your insights.

ftesser · 2020-12-01T09:20:12Z

Thanks @Timoeller for your feedback.
Of course, if DDP works on Windows with pytorch 1.7.0, this is the best solution, so I am OK to wait for that. I will try updating pytorch beforehand myself.

Just a note about this PR: commit 0aedd05 solves a UnicodeDecodeError, so it should be applied in any case.

Timoeller · 2020-12-23T10:03:36Z

Sorry for the delay @ftesser
We actually made huge changes in #649 that required a lot of testing.

We also quickly tested updating pytorch in #660, which produced failing onnx conversion tests. Unfortunately we will only be able to look into this at the start of the new year.

I will merge your PR now to include this nice patch in the coming release.

ftesser added 2 commits November 19, 2020 09:03

Removed not used 'from torch.distributed import all_gather' from 'farm\modeling\prediction_head.py'

4e0c9e3

Fixed UnicodeDecodeError (Windows).

0aedd05

ftesser changed the title ~~Make tests working on Windws~~ Make tests working on Windows Nov 19, 2020

Skip test_dpr_modules for Windows (it referes to DDP that does not work on Windows).

e6d7c78

Timoeller mentioned this pull request Nov 30, 2020

Tests fail on Windows #636

Closed

Timoeller merged commit e7b6a2a into deepset-ai:master Dec 23, 2020

ftesser mentioned this pull request Jan 26, 2021

Re-enable test_dpr_modules also for windows. #697

Merged

This pull request was closed.
