Refactor code to enable test of read, process, translate and write stages. Now can deal with http and key directive. ; #6
soccerjustinh1 committed May 12, 2019
1 parent 5ac61fa commit 5a01a03
Showing 10 changed files with 1,206 additions and 187 deletions.
53 changes: 32 additions & 21 deletions README.md
@@ -52,17 +52,17 @@ $ ./mediawiki.py --dir mydir/ --lang 'ru,es'

## Other notes

Overview
### Overview
Translate an input mediawiki file (in English, say) and generate an output mediawiki file in the requested language (Spanish by default).
orig_input.txt -> script -> orig_output.txt

Inputs
### Inputs
- input file
- Language of output file (default: Spanish)
- Language of output files can be a list of languages eg 'ru,es' would be for Russian and Spanish.
- input directory - so need to get a list of all files in that directory and then parse each one of them.
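
As a rough illustration of the inputs listed above, the sketch below splits a comma-separated language string such as 'ru,es' and collects the files in an input directory. The function names `parse_languages` and `list_input_files` are hypothetical and are not taken from mediawiki.py; the fallback of 'es' simply mirrors the default stated above.

```python
import os


def parse_languages(lang_arg=None, default="es"):
    """Split a comma-separated language string such as 'ru,es' into a list.

    Falls back to the assumed default language when nothing is given.
    """
    if not lang_arg:
        return [default]
    return [code.strip() for code in lang_arg.split(",") if code.strip()]


def list_input_files(input_dir):
    """Return the sorted paths of the regular files directly inside input_dir."""
    return sorted(
        os.path.join(input_dir, name)
        for name in os.listdir(input_dir)
        if os.path.isfile(os.path.join(input_dir, name))
    )
```

Each path returned by `list_input_files` could then be parsed once per language returned by `parse_languages`.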

Outputs
### Outputs
- output file named "myfile-es.txt" if the input is "myfile.txt"
- status
- success (zero) or
@@ -73,37 +73,48 @@ Outputs

### Control Flow

1. Open and read input file
2. Parse input file into a data structure
3. Send requests to Cloud Translation to perform the language conversion
4. Create and write to output file
1. Open and read input file.
1. Parse input file into a data structure.
1. Process each line one at a time.
1. For each line replace special text sequences with a symbol as we may want to translate these separately.
1. Send requests to Cloud Translation to perform the language conversion.
1. Create and write to output file.

Error conditions
#### Error conditions
- Cannot find input file
- Empty input file
- Format of input file not valid according to mediawiki
- Unable to send requests to Cloud Translation
- Unable to create output file
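
A minimal sketch of how the first two error conditions might be detected and turned into an exit status. It assumes errors are raised as `ValueError` (the hunks in mediawiki.py do catch `ValueError`) and that any failure maps to a status of 1; the names `check_input_file` and `exit_status` are hypothetical.

```python
import os


def check_input_file(path):
    """Raise ValueError if the input file is missing or empty."""
    if not os.path.isfile(path):
        raise ValueError("Cannot find input file: " + path)
    if os.path.getsize(path) == 0:
        raise ValueError("Empty input file: " + path)


def exit_status(action):
    """Run action() and return 0 on success, or 1 (assumed) on a reported error."""
    try:
        action()
    except (ValueError, OSError) as exception:
        print("Error:", exception)
        return 1
    return 0
```

The remaining conditions (invalid mediawiki format, failed translation requests, unwritable output file) could be reported the same way from the stage that detects them.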

Data structure(s)
#### Data structure(s)

List of objects
- Object
##### List of objects
- Object - for each line in the input file.
- Original Line - Original line of text from the input file, in English, say.
- Translated Line - The final translated line of text into the requested output language
- Line number - Line number of the input file.
- Sequence - The unique sequence to indicate the special sequence that we don't want to translate.
- Original - The original text (in English, say).
- Sequence Line - The original line after the special sequences of interest have been replaced with unique sequences, so that they are not sent for translation.
- Sequences - List of the unique sequences in the current line, which we may or may not want to translate individually.
- Empty Line - Boolean true or false so that we don't ask Google to translate an empty string.

- Object - for each unique sequence for a given line.
- sequence - This is a special sequence that looks like 123-456, say.
- original - This is the original string before any translations.
- translate - Boolean true or false indicating whether we would like to translate the string or leave it in the original language.
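
The two objects described above could be modelled roughly as follows. This is only a sketch: the class and field names are adapted from the bullet points and are assumptions, since the WikiParser.py diff is not rendered here and may organise the data differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpecialSequence:
    """One protected fragment of a line (hypothetical name)."""
    sequence: str            # unique placeholder, e.g. "123-456"
    original: str            # text before any translation
    translate: bool = False  # whether this fragment should be translated too


@dataclass
class LineRecord:
    """Everything tracked for one line of the input file (hypothetical name)."""
    line_number: int
    original_line: str
    sequence_line: str = ""    # original line with placeholders substituted
    translated_line: str = ""  # final text in the output language
    sequences: List[SpecialSequence] = field(default_factory=list)
    empty_line: bool = False   # True means: do not send to the translator


def make_line_record(line_number, text):
    """Build a record and flag blank lines so they are never translated."""
    return LineRecord(line_number=line_number,
                      original_line=text,
                      empty_line=not text.strip())
```

A parsed file would then be a list of `LineRecord` objects, one per input line.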


### Detailed Design

#### Control Flow
- Start with the parsing of the input arguments to verify them.
- Parse over the input file
- Look at one line at a time
- Look for specific patterns of interest in the input file and if they are special then remove them from the line and replace them with a unique tag.
- Then send the remaining line to Google Cloud Translate API
- Each of the special unique tags replace them with the original content
- OR some of the special unique tags we need to still translate them but just a bit of their content
- write the line to the output file
1. Start with the parsing of the input arguments to verify them.
1. Parse over the input file
1. Look at one line at a time
1. Look for specific patterns of interest in the input file and if they are special then remove them from the line and replace them with a unique tag.
1. Then send the remaining line to Google Cloud Translate API
1. Each of the special unique tags replace them with the original content
1. OR some of the special unique tags we need to still translate them but just a bit of their content
1. write the line to the output file
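
The tag-and-restore steps above can be sketched as follows. The regular expression only covers `[[...]]` links and `{...}` directives, the tag format is invented for illustration, and `translate` is a stand-in callable rather than a real Cloud Translation request, so none of this should be read as the actual WikiParser.py logic. Partially translating the content of a tag is not covered.

```python
import re

# Assumed patterns of interest: wiki links [[...]] and brace directives {...}.
SPECIAL_PATTERN = re.compile(r"\[\[.*?\]\]|\{.*?\}")


def protect(line):
    """Replace each special sequence with a unique tag; return (line, mapping)."""
    mapping = {}

    def substitute(match):
        tag = "000-{:03d}".format(len(mapping))  # illustrative tag format
        mapping[tag] = match.group(0)
        return tag

    return SPECIAL_PATTERN.sub(substitute, line), mapping


def restore(line, mapping):
    """Put the original content back in place of each tag."""
    # Newest tags first, so a longer tag is never clobbered by a shorter prefix.
    for tag, original in reversed(list(mapping.items())):
        line = line.replace(tag, original)
    return line


def translate_line(line, translate):
    """Protect special sequences, translate the rest, then restore them."""
    if not line.strip():
        return line  # never send an empty line to the translator
    protected, mapping = protect(line)
    return restore(translate(protected), mapping)
```

For a quick check without network access, `str.upper` can play the role of the translator: `translate_line("See [[Content blocking]] now", str.upper)` leaves the link untouched while the surrounding words change.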

#### Data Flow
- Need to add more details here.
419 changes: 254 additions & 165 deletions WikiParser.py

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions mediawiki.py
@@ -205,8 +205,10 @@ def main():

mediawikiParser = WikiParser.WikiParser(inputFilename = inputArgs['<inputfilename>'],
outputLanguage = currentLang)
mediawikiParser.readProcessTranslateWrite()
else:
mediawikiParser = WikiParser.WikiParser(inputFilename = inputArgs['<inputfilename>'])
mediawikiParser.readProcessTranslateWrite()

except ValueError as exception:
# what was the details of the error.
@@ -249,8 +251,10 @@ def main():

mediawikiParser = WikiParser.WikiParser(inputFilename = inputDir + currentFile,
outputLanguage = currentLang)
mediawikiParser.readProcessTranslateWrite()
else:
mediawikiParser = WikiParser.WikiParser(inputFilename = inputDir + currentFile)
mediawikiParser.readProcessTranslateWrite()

# DEBUG: Report the data structure of special sequences.
# mediawikiParser.printWikiParser()
@@ -0,0 +1,57 @@
Enable and disable cookies that websites use to track your preferences

[[Cookies - Information that websites store on your computer|Cookies]] are stored on your computer by websites you visit and contain information such as site preferences or your login status. This article describes how to enable and disable cookies in Firefox.

__TOC__

= How do I view or change my cookie settings? =
{note}'''Note: Cookies are enabled by default in Firefox.'''{/note}

{for not fx63}
# [[T:optionspreferences]]
# Select the {menu Privacy & Security} panel and go to the '''Cookies and Site Data''' section.
# Select '''Accept cookies and site data from websites (recommended)''' to enable cookies. To disable cookies, select '''Block cookies and site data (may cause websites to break)'''.
#;{for =fx60}[[Image:Fx60Settings-CookiesAndSiteData]]{/for}{for =fx61, =fx62}[[Image:Fx61settings-CookiesAndSiteData]]{/for}
#* If you are troubleshooting problems with websites, make sure that '''Accept third-party cookies and site data''' is NOT set to '''Never'''. For more information, see [[Disable third-party cookies in Firefox to stop some types of tracking by advertisers]].
# Choose how long cookies are allowed to be stored:
#* Keep until:<br>{for =fx60}'''they expire'''{/for}{for fx61}'''They expire'''{/for}: Each cookie will be removed when it reaches its expiration date, which is set by the site that sent the cookie.<br>{for =fx60}'''I close Firefox'''{/for}{for fx61}'''Firefox is closed'''{/for}: The cookies that are stored on your computer will be removed when Firefox is closed.
# [[T:closeOptionsPreferences]]
{/for}
{for =fx63, =fx64}
# [[T:optionspreferences]]
# Select the {menu Privacy & Security} panel and go to the '''Cookies and Site Data''' section.
#;[[Image:Fx63settings-AcceptCookies]]
# Select '''Accept cookies and site data''' to enable cookies. To disable cookies, select '''Block cookies and site data''' and use the drop-down menu next to '''Type blocked''' to choose the type of cookies to block.
#* If you are troubleshooting problems with websites, make sure that '''Accept third-party cookies and site data''' is NOT set to '''Never'''. For more information, see [[Disable third-party cookies in Firefox to stop some types of tracking by advertisers]].
# Choose how long cookies are allowed to be stored:
#* Keep until:<br>'''They expire''': Each cookie will be removed when it reaches its expiration date, which is set by the site that sent the cookie.<br>'''Firefox is closed''': The cookies that are stored on your computer will be removed when Firefox is closed.
# [[T:closeOptionsPreferences]]
{/for}

{for fx65}
Click the menu button [[Image:fx57menu]] and select {menu Content Blocking}. The {menu Privacy & Security} panel of Firefox [[T:optionsorpreferences]] will open. This is where you can view your settings for '''Content Blocking''', which includes cookies.
;[[Image:fx65ContentBlocking]]
*If {menu Standard} is selected, it means that you are using the default settings for content blocking and cookies are enabled.

'''To block cookies''':
Select {menu Custom} and check mark '''Cookies'''.
;[[Image:Fx65Custom-ThirdPartyCookies]]
''Third-Party Trackers'' is the default setting for blocking cookies. Use the drop-down menu to change the type of cookies blocked. Note that disabling cookies can cause problems with websites. See [[Disable third-party cookies in Firefox to stop some types of tracking by advertisers|this article]] to learn more about third-party cookies.

'''To enable all cookies''', do one of the following:
* Select {menu Custom} and clear the '''Cookies''' check mark.
* Alternatively, select {menu Standard} to restore the default settings.

To learn more about these settings, see the [[Content blocking]] article.

= Clear cookies when you close Firefox =
To remove all cookies and site data when Firefox is closed:
#[[T:OptionsPreferences]]
#Select the {menu Privacy & Security} panel and go to the '''Cookies and Site Data''' section.
#;[[Image:Fx65CookiesAndSiteData]]
# Check mark ''Delete cookies and site data when Firefox is closed''.
Each time you close Firefox, the cookies that are stored on your computer will be removed.
{/for}

= Websites report cookie errors =
If a website gives you an error message that cookies must be enabled, make sure that you have not blocked cookies for the website. See [[Block websites from storing cookies and site data in Firefox]] and [[Websites say cookies are blocked - Unblock them]] for more information.
