
Hello, I would like to ask a question about the dataset #9

Closed
yixuan004 opened this issue Jan 26, 2022 · 1 comment


yixuan004 commented Jan 26, 2022

I am processing the ACE2004 dataset with the workflow and code provided here:

https://github.com/LorrinWWW/two-are-better-than-one/tree/master/datasets

① Running python ace2json.py generated the following files:

(screenshot of the generated files, image not preserved)

② Running python unify.py generated the following files in the unified directory:

(screenshot of the generated files, image not preserved)

For your script command ./ace2004.sh ace2004_folder, and for split.py, which path should I pass as the ace2004_folder argument? Thanks!

Receiling (Owner) commented

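You only need to run python ace2json.py (the first step) to get the data under the json folder,
then run the split.py script on the files under json/train to split off the dev data,
then manually arrange the folders ace2004_folder/flod1, ace2004_folder/flod2, ..., ace2004_folder/flod5 (each folder containing train.json, dev.json, and test.json),
and finally run ace2004.sh to get the final data.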
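
The manual folder arrangement in the third step could be scripted roughly as follows. This is only a sketch under assumptions: the helper names arrange_folds and missing_files are hypothetical and not part of the repository, and where the per-fold train/dev/test files sit after ace2json.py and split.py is not stated in this thread, so the caller has to supply those paths. The folder names are spelled exactly as in the reply above.

```python
import shutil
from pathlib import Path

# File names each fold folder must contain, per the reply above.
SPLITS = ("train.json", "dev.json", "test.json")


def arrange_folds(sources, dest_root, n_folds=5):
    """Copy per-fold split files into dest_root/flod1 ... dest_root/flodN.

    sources maps a 1-based fold number to a dict of the form
    {"train.json": path, "dev.json": path, "test.json": path}.
    Where those source files live is an assumption left to the caller.
    """
    dest_root = Path(dest_root)
    for k in range(1, n_folds + 1):
        fold_dir = dest_root / f"flod{k}"  # name spelled as in the reply
        fold_dir.mkdir(parents=True, exist_ok=True)
        for name in SPLITS:
            shutil.copyfile(sources[k][name], fold_dir / name)
    return dest_root


def missing_files(dest_root, n_folds=5):
    """Return (fold number, file name) pairs that are still missing."""
    dest_root = Path(dest_root)
    return [
        (k, name)
        for k in range(1, n_folds + 1)
        for name in SPLITS
        if not (dest_root / f"flod{k}" / name).is_file()
    ]
```

If missing_files("ace2004_folder") returns an empty list, the layout matches what the reply describes, and ace2004_folder would then be the path passed to ./ace2004.sh.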
