# Cross-lingual Vision-Language Navigation

We introduce a new dataset for Cross-Lingual Vision-Language Navigation.

## Cross-lingual Room-to-Room (XL-R2R) Dataset

The XL-R2R dataset is built upon the R2R dataset and extends it with Chinese instructions. XL-R2R preserves the same splits as R2R: the train, val-seen, and val-unseen splits carry both English and Chinese instructions, while the test split carries English instructions only.

Data is formatted as follows:

```json
{
  "distance": float,
  "scan": str,
  "path_id": int,
  "path": [str x num_steps],
  "heading": float,
  "instructions": [str x 3],
}
```
- `distance`: length of the path in meters.
- `scan`: Matterport scan id.
- `path_id`: unique id for this path.
- `path`: list of viewpoint ids (the first is the start location, the last is the goal location).
- `heading`: the agent's initial heading in radians (elevation is always assumed to be zero).
- `instructions`: three unique natural language strings describing how to find the goal given the start pose. A loading sketch follows this list.
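
To make the format concrete, here is a minimal sketch of loading and inspecting one split with Python's standard library. The filename `XL-R2R_train.json` is an assumption for illustration; substitute the path of whichever split file you downloaded.

```python
import json

# Hypothetical filename; point this at the XL-R2R split you downloaded.
SPLIT_PATH = "XL-R2R_train.json"

with open(SPLIT_PATH, encoding="utf-8") as f:
    episodes = json.load(f)  # a list of dicts in the format described above

for ep in episodes[:3]:
    print(f"path_id={ep['path_id']}  scan={ep['scan']}  "
          f"distance={ep['distance']:.1f} m  heading={ep['heading']:.2f} rad")
    print("  start viewpoint:", ep["path"][0])
    print("  goal viewpoint: ", ep["path"][-1])
    for i, instr in enumerate(ep["instructions"]):  # three instructions per path
        print(f"  instruction {i}: {instr[:60]}...")
```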

For the test split, only the first viewpoint of each path (the start location) is included; a test server hosted by Anderson et al. scores uploaded trajectories.
