-
Notifications
You must be signed in to change notification settings - Fork 0
/
ACT-V1.7.config
107 lines (76 loc) · 6.74 KB
/
ACT-V1.7.config
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
#***********************************************Copyright and License ********************************************************************************************
#ACT for Accuracy of Connective Translation is a reference-based metric to measure the accuracy of discourse connective translation.
#Copyright (c) 2013 Idiap Research Institute, http://www.idiap.ch/
#Written by Najeh Hajlaoui <Najeh.Hajlaoui@idiap.ch> or <Najeh.Hajlaoui@gmail.com>
#This file is part of ACT.
#ACT is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License version 3 as published by the Free Software Foundation.
#ACT is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#You should have received a copy of the GNU General Public License along with ACT. If not, see <http://www.gnu.org/licenses/>.
#***********************************************End of Copyright and License ********************************************************************************************
#modify the following parameters to configure ACT tool. By default, these parameters are configured to assess the translation sample included in ACT ($SourceLanguage=en and $TargetLanguage=fr) without alignment ($UsingAlignment=0;).
########################## Configuration step 1 ##########################
#ACT directory
$ACTHOME=/Users/Hajlaoui/tools/ACT/ACT-V1.7;
# To run ACT, set up firstly the following parameters and go to $ACTHOME directory (perl ACT-Vx.x.pl).
#Working directory
$WorkingDir=/Users/Hajlaoui/tools/ACT/ACT-V1.7/Samples/En-Fr/WithoutAlignment;
#if $ACTHOME=$WorkingDir, it will be more practical to set up the 3 files names (without path) in the command line respecting the following order SOURCE REFERENCE CANDIDATE, in this case the priority is for the command line instruction
#set up the 3 following file name variables
# Name of the source file without path, have to be lowercased and tokenised
$SourceFile=20undoc.2000.en-fr.tok.lc.src;
# Name of the reference file without path, have to be lowercased and tokenised, (if Arabic transliterated and tokenized version)
$ReferenceFile=20undoc.2000.en-fr.tok.lc.ref;
# Name of the candidate file without path, have to be lowercased and tokenised, (if Arabic transliterated and tokenized version)
$CandidateFile=20undoc.2000.en-fr.tok.lc.cand;
#Set up source and target languages
$SourceLanguage=en;# possible values: en
$TargetLanguage=fr;#possible values: fr, ar, it, de
# Set the $case5manual and $case6manual variables to the correct numbers, by default $case5manual=0 and $case6manual=0
$case5manual=0;
$case6manual=0;
#To print details in terminal
$PrintReport=1; #possible values: 0|1
#To print case5 occurrences in terminal
$PrintCase5=0;# possible values: 0|1
########################## Configuration step 2 ##########################
#
#You can use ACT without alignment ($UsingAlignment=0) or with alignment information ($UsingAlignment=1)
$UsingAlignment=0; #possible values: 0|1
#if you choose $UsingAlignment=0 you don't then need to configure the rest of parameters, that's mean you don't need to set up the $PrintAligInfos parameter and all the paramaters of Configuration step 3.
#
#in the second case (with alignment: $UsingAlignment=1) you can choose to print or not the alignment information
$PrintAligInfos=1; #possible values: 0|1
#
########################## Configuration step 3 ##########################
#ATTENTION: YOU SHOULD CHOOSE $UsingAlignment=1, YOU HAVE THEN TO CHOOSE ONE OF THE 2 ALIGNMENT METHODS:
#1- GIZA MODEL (($GizaModel=1 and $RunGiza=1) or ($GizaModel=1 and $RunGiza=0 if you don't want to run Giza a second time)
#2- SAVING MODEL ($SavingModel=1; and set up the 2 variables $SavingModel_Src_Ref and $SavingModel_Src_Cand ),
# IF YOU CHOOSE BOTH, AN AUTOMATIC CONFIGURATION WILL CHOOSE FOR YOU THE GIZA MODEL; **********
#******** Alignment method 1: GizaModel or without Training *****************
#
#Set up the GizaModel alignment
$GizaModel=0; #possible values: 0|1
#You may run GIZA just the first time for the same data.
$RunGiza=0; #possible values: 0|1
$PathToGIZA=/Users/Hajlaoui/tools/giza-pp-v1.0.7/giza-pp/GIZA++-v2; # the same directory should contain also the binary plain2snt.out
#The GIZA's outputs will be created automatically in in a new folders ./Giza-Src_Ref/.tst.A3.final and ./Giza-Src_Cand/.tst.A3.final
#If your Giza tool is configured differently, please adapt your configuration or change your files names to the proposed ones.
#
#******** End of Alignment method 1: GizaModel or without Training **********
#******** Alignment method 2: SavingModel or with Training *****************
#
#IMPORTANT, if you choose $SavingModel=1, that's mean you have your own saving alignment system, 2 externals scripts have to be used then before using the actual ACT tool:
#1-ACT-preprocess-xx_yy-V1.4.pl for a preprocessing step of data for the saving alignment system. (xx_yy depend on the source and target languages). (for example, you can use Europarl corpus to build the saving model, which will be used to align new data)
#Between the precedent preprocessing step and the following postprocessing step, you have to align your data ((source//reference) and (source//candidate)).
#2-Symmetrization2Giza-V1.pl for a postprocessing step of the saving model alignment's output to convert it from the format produced by simmetrized giza (or Mgiza) to giza output. The alinement result of (source//reference) and (source//candidate) have to be placed (respectively with the following file names) in $WorkingDir/model/$SavingModel_Src_Ref and $WorkingDir/model/$SavingModel_Src_Cand
#
#Rq: the 2 last scripts are included in ACT. If you choose to use GIZAModel ($GizaModel=1;) of course you dont need to use these 2 scripts.
#Set up the SavingModel alignment
$SavingModel=0; #possible values: 0|1
#Set up the source_reference alignment output after conversion from simmetrized giza (or Mgiza) to giza output using the Symmetrization2Giza-V1.pl script, to placed in a new folder "model", please create a new folder "model" in $WorkingDir.
$SavingModel_Src_Ref=2100-ar-en-zh.SMTB2.Src-Ref.toact.V1.4.en-ar.grow-diag-final-and.giza;#only if $SavingModel=1
#Set up the source_candidate alignment output after conversion from simmetrized giza (or Mgiza) to giza output using the Symmetrization2Giza-V1.pl script, to placed in the same new folder "model" in $WorkingDir.
$SavingModel_Src_Cand=2100-ar-en-zh.SMTB2.Src-Cand.toact.V1.4.en-ar.grow-diag-final-and.giza;#only if $SavingModel=1,
#
#******** End of Alignment method 2: SavingModel or with Training **********
########################## End of Configuration step 3 #########################