Commit
1 parent 7dc160c, commit 6782024
Showing 20 changed files with 197 additions and 53 deletions.
@@ -1,5 +1,13 @@
 import pandas as pd

-def br(corpus_data):
+def br(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up br data
-    return corpus_data
+    return sentence
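The rename from `corpus_data` to `sentence` suggests that each language-specific cleaner now handles one sentence at a time. A minimal sketch of how such per-sentence functions might be dispatched over a list of sentences follows. The `CLEANERS` table, the `clean_corpus` helper, and the whitespace normalization inside `en` are illustrative assumptions and are not part of this commit, whose cleaners are still stubs.

```python
def br(sentence):
    """Stand-in for the commit's stub: returns the sentence unchanged."""
    return sentence


def en(sentence):
    """Hypothetical cleaner (not in the commit): collapses runs of whitespace."""
    return " ".join(sentence.split())


# Hypothetical lookup table from language code to its cleaner function.
CLEANERS = {"br": br, "en": en}


def clean_corpus(language, sentences):
    """Apply the cleaner registered for `language` to every sentence."""
    cleaner = CLEANERS[language]
    return [cleaner(sentence) for sentence in sentences]


print(clean_corpus("en", ["hello   world ", "ok"]))
```

Keeping each cleaner a pure `str -> str` function means it can be tested on single sentences and reused with any container, whether a plain list as here or a pandas column.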
@@ -1,5 +1,13 @@
 import pandas as pd

-def ca(corpus_data):
+def ca(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up ca data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def cv(corpus_data):
+def cv(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up cv data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def cy(corpus_data):
+def cy(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up cy data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def de(corpus_data):
+def de(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up de data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def en(corpus_data):
+def en(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up en data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def fr(corpus_data):
+def fr(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up fr data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def gaIE(corpus_data):
+def gaIE(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up ga-IE data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def it(corpus_data):
+def it(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up it data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def kab(corpus_data):
+def kab(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up kab data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def ky(corpus_data):
+def ky(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up ky data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def sl(corpus_data):
+def sl(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up sl data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def tr(corpus_data):
+def tr(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up tr data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def tt(corpus_data):
+def tt(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up tt data
-    return corpus_data
+    return sentence
@@ -1,5 +1,13 @@
 import pandas as pd

-def zhTW(corpus_data):
+def zhTW(sentence):
+    """Cleans up the passed sentence, removing or reformatting invalid data.
+
+    Args:
+        sentence (str): Sentence to be cleaned up.
+
+    Returns:
+        (str): Cleaned up sentence.
+    """
     # TODO: Clean up zh-TW data
-    return corpus_data
+    return sentence
@@ -1,7 +1,15 @@
-def sample_size(train_size):
-    z_score = 2.58 # Corresponds to confidence level 99%
+def sample_size(population_size):
+    """Calculates the sample size.
+
+    Calculates the sample size required to draw from a population size `population_size`
+    with a confidence level of 99% and a margin of error of 1%.
+
+    Args:
+        population_size (int): The population size to draw from.
+    """
     margin_of_error = 0.01
     fraction_picking = 0.50
+    z_score = 2.58 # Corresponds to confidence level 99%
     numerator = (z_score**2 * fraction_picking * (1 - fraction_picking)) / (margin_of_error**2)
-    denominator = 1 + (z_score**2 * fraction_picking * (1 - fraction_picking)) / (margin_of_error**2 * train_size)
+    denominator = 1 + (z_score**2 * fraction_picking * (1 - fraction_picking)) / (margin_of_error**2 * population_size)
     return numerator / denominator
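To get a feel for the numbers, the updated function can be run on a few population sizes. The body below restates the new side of the diff. The example population sizes and the rounding up with `math.ceil` are assumptions added for illustration and do not appear in the commit. With z = 2.58, p = 0.5, and a margin of error of 0.01, the numerator z^2 * p * (1 - p) / e^2 works out to roughly 16641, and the denominator acts as a finite-population correction that shrinks the result for small populations.

```python
import math


def sample_size(population_size):
    """Sample size for a 99% confidence level and a 1% margin of error."""
    margin_of_error = 0.01
    fraction_picking = 0.50
    z_score = 2.58  # Corresponds to confidence level 99%
    numerator = (z_score**2 * fraction_picking * (1 - fraction_picking)) / (margin_of_error**2)
    denominator = 1 + (z_score**2 * fraction_picking * (1 - fraction_picking)) / (margin_of_error**2 * population_size)
    return numerator / denominator


# Illustrative population sizes (assumed, not taken from the commit).
for population in (1_000, 100_000, 10_000_000):
    # Round up, since a fractional sample cannot be drawn.
    print(population, math.ceil(sample_size(population)))
```

The function returns a float, so a caller that needs a row count would likely round up as shown. The result never exceeds the population size and levels off near 16641 as the population grows.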