Commit
Merge ff642a5 into e1ab9a0
desilinguist committed Mar 13, 2020
2 parents e1ab9a0 + ff642a5 commit b5dec1a
Showing 29 changed files with 1,934 additions and 511 deletions.
3 changes: 2 additions & 1 deletion .travis.yml
@@ -9,10 +9,11 @@ notifications:
env:
global:
- COVERALLS_PARALLEL=true
- BINPATH=${HOME}/miniconda3/envs/rsmenv/bin
matrix:
- TESTFILES="tests/test_experiment_rsmtool_1.py"
- TESTFILES="tests/test_comparer.py tests/test_configuration_parser.py tests/test_experiment_rsmtool_2.py"
- TESTFILES="tests/test_analyzer.py tests/test_experiment_rsmeval.py tests/test_fairness_utils.py tests/test_prmse_utils.py tests/test_container.py tests/test_test_utils.py"
- TESTFILES="tests/test_analyzer.py tests/test_experiment_rsmeval.py tests/test_fairness_utils.py tests/test_prmse_utils.py tests/test_container.py tests/test_test_utils.py tests/test_cli.py"
- TESTFILES="tests/test_experiment_rsmcompare.py tests/test_experiment_rsmsummarize.py tests/test_modeler.py tests/test_preprocessor.py tests/test_writer.py tests/test_experiment_rsmtool_3.py"
- TESTFILES="tests/test_experiment_rsmpredict.py tests/test_reader.py tests/test_reporter.py tests/test_transformer.py tests/test_utils.py tests/test_experiment_rsmtool_4.py"
sudo: false
1 change: 1 addition & 0 deletions DistributeTests.ps1
@@ -43,6 +43,7 @@ elseif ($agentNumber -eq 3) {
$testsToRun = $testsToRun + "tests/test_fairness_utils.py"
$testsToRun = $testsToRun + "tests/test_prmse_utils.py"
$testsToRun = $testsToRun + "tests/test_test_utils.py"
$testsToRun = $testsToRun + "tests/test_cli.py"
}
elseif ($agentNumber -eq 4) {
$testsToRun = $testsToRun + "tests/test_experiment_rsmcompare.py"
1 change: 1 addition & 0 deletions doc/api.rst
@@ -177,6 +177,7 @@ From :py:mod:`~rsmtool.transformer` Module
From :py:mod:`~rsmtool.utils` Module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: rsmtool.utils.commandline.generate_configuration
.. _agreement_api:
.. autofunction:: rsmtool.utils.metrics.agreement
.. _dsm_api:
2 changes: 1 addition & 1 deletion doc/config_rsmcompare.rst
@@ -3,7 +3,7 @@
Experiment configuration file
"""""""""""""""""""""""""""""

This is a file in ``.json`` format that provides overall configuration options for an ``rsmcompare`` experiment. Here's an example configuration file for `rsmcompare <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmcompare/config_rsmcompare.json>`_.
This is a file in ``.json`` format that provides overall configuration options for an ``rsmcompare`` experiment. Here's an `example configuration file <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmcompare/config_rsmcompare.json>`_ for ``rsmcompare``. To make it easy to get started with ``rsmcompare``, we provide a way to automatically generate a configuration file that you can then edit based on your data and your needs. To do so, simply run ``rsmcompare generate`` at the command line. If you have :ref:`subgroups <subgroups_rsmtool>` in your data that you want to include in your analyses, run ``rsmcompare generate --subgroups`` instead. Next, we describe all of the ``rsmcompare`` configuration fields in detail.

There are seven required fields and the rest are all optional. We first describe the required fields and then the optional ones (sorted alphabetically).

2 changes: 1 addition & 1 deletion doc/config_rsmeval.rst
@@ -3,7 +3,7 @@
Experiment configuration file
"""""""""""""""""""""""""""""

This is a file in ``.json`` format that provides overall configuration options for an ``rsmeval`` experiment. Here's an example configuration file for `rsmeval <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmeval/config_rsmeval.json>`_.
This is a file in ``.json`` format that provides overall configuration options for an ``rsmeval`` experiment. Here's an `example configuration file <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmeval/config_rsmeval.json>`_ for ``rsmeval``. To make it easy to get started with ``rsmeval``, we provide a way to automatically generate a configuration file that you can then edit based on your data and your needs. To do so, simply run ``rsmeval generate`` at the command line. If you have :ref:`subgroups <subgroups_eval>` in your data that you want to include in your analyses, run ``rsmeval generate --subgroups`` instead. Next, we describe all of the ``rsmeval`` configuration fields in detail.

There are four required fields and the rest are all optional. We first describe the required fields and then the optional ones (sorted alphabetically).

2 changes: 1 addition & 1 deletion doc/config_rsmpredict.rst
@@ -2,7 +2,7 @@

Experiment configuration file
"""""""""""""""""""""""""""""
This is a file in ``.json`` format that provides overall configuration options for an ``rsmpredict`` experiment. Here's an example configuration file for `rsmpredict <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmpredict/config_rsmpredict.json>`_.
This is a file in ``.json`` format that provides overall configuration options for an ``rsmpredict`` experiment. Here's an `example configuration file <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmpredict/config_rsmpredict.json>`_ for ``rsmpredict``. To make it easy to get started with ``rsmpredict``, we provide a way to automatically generate a configuration file that you can then edit based on your data and your needs. To do so, simply run ``rsmpredict generate`` at the command line. Next, we describe all of the ``rsmpredict`` configuration fields in detail.

There are three required fields and the rest are all optional. We first describe the required fields and then the optional ones (sorted alphabetically).

2 changes: 1 addition & 1 deletion doc/config_rsmsummarize.rst
@@ -3,7 +3,7 @@
Experiment configuration file
"""""""""""""""""""""""""""""

This is a file in ``.json`` format that provides overall configuration options for an ``rsmsummarize`` experiment. Here's an example configuration file for `rsmsummarize <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmsummarize/config_rsmsummarize.json>`_.
This is a file in ``.json`` format that provides overall configuration options for an ``rsmsummarize`` experiment. Here's an `example configuration file <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmsummarize/config_rsmsummarize.json>`_ for ``rsmsummarize``. To make it easy to get started with ``rsmsummarize``, we provide a way to automatically generate a configuration file that you can then edit based on your data and your needs. To do so, simply run ``rsmsummarize generate`` at the command line. Next, we describe all of the ``rsmsummarize`` configuration fields in detail.

There are two required fields and the rest are all optional. We first describe the required fields and then the optional ones (sorted alphabetically).

2 changes: 1 addition & 1 deletion doc/config_rsmtool.rst
@@ -3,7 +3,7 @@
Experiment configuration file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is a file in ``.json`` format that provides overall configuration options for an ``rsmtool`` experiment. Here's an example configuration file for `rsmtool <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmtool/config_rsmtool.json>`_.
This is a file in ``.json`` format that provides overall configuration options for an ``rsmtool`` experiment. Here's an `example configuration file <https://github.com/EducationalTestingService/rsmtool/blob/master/examples/rsmtool/config_rsmtool.json>`_ for ``rsmtool``. To make it easy to get started with ``rsmtool``, we provide a way to automatically generate a configuration file that you can then edit based on your data and your needs. To do so, simply run ``rsmtool generate`` at the command line. If you have :ref:`subgroups <subgroups_rsmtool>` in your data that you want to include in your analyses, run ``rsmtool generate --subgroups`` instead. Next, we describe all of the ``rsmtool`` configuration fields in detail.

There are four required fields and the rest are all optional. We first describe the required fields and then the optional ones (sorted alphabetically).

6 changes: 3 additions & 3 deletions rsmtool/__init__.py
@@ -13,7 +13,7 @@
import warnings

try:
import rsmextra
import rsmextra # noqa
except ImportError:
HAS_RSMEXTRA = False
else:
@@ -22,13 +22,13 @@
from .version import __version__

if HAS_RSMEXTRA:
from rsmextra.version import __version__ as rsmextra_version
from rsmextra.version import __version__ as rsmextra_version # noqa
VERSION_STRING = '%(prog)s {}; rsmextra {}'.format(__version__,
rsmextra_version)
else:
VERSION_STRING = '%(prog)s {}'.format(__version__)

from .analyzer import Analyzer
from .analyzer import Analyzer # noqa

from .convert_feature_json import convert_feature_json_file # noqa
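
The optional-import pattern in this hunk can be sketched in isolation. The helper name `build_version_string` and its parameters are hypothetical and not part of rsmtool, which builds `VERSION_STRING` inline at import time. The sketch also assumes the optional package exposes a top-level `__version__` attribute, whereas the hunk imports it from `rsmextra.version`.

```python
import importlib


def build_version_string(core_version, extra_name="rsmextra"):
    """Return an argparse-style version template.

    Hypothetical helper for illustration only.
    """
    try:
        extra = importlib.import_module(extra_name)
    except ImportError:
        # the optional package is missing: report only the core version
        return '%(prog)s {}'.format(core_version)
    else:
        # assumption: the optional package has a top-level __version__
        extra_version = getattr(extra, '__version__', 'unknown')
        return '%(prog)s {}; {} {}'.format(core_version,
                                           extra_name,
                                           extra_version)
```

A string of this shape can be passed as `version=` to an `argparse` argument with `action='version'`, which substitutes `%(prog)s` with the program name when printing.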

30 changes: 20 additions & 10 deletions rsmtool/configuration_parser.py
@@ -282,15 +282,22 @@ def __len__(self):

def __str__(self):
"""
Return string representation of the object keys
as comma-separated list.
Return a string representation of the underlying configuration
dictionary.
Returns
-------
config_names : str
A comma-separated list of names from the config dictionary.
config_string : str
A string representation of the underlying configuration
dictionary as encoded by ``json.dumps()``. It only
includes the configuration options that can be set by
the user.
"""
return ', '.join(self._config)
expected_fields = (CHECK_FIELDS[self._context]['required'] +
CHECK_FIELDS[self._context]['optional'])

output_config = {k: v for k, v in self._config.items() if k in expected_fields}
return json.dumps(output_config, indent=4, separators=(',', ': '))

def __iter__(self):
"""
@@ -563,12 +570,8 @@ def save(self, output_dir=None):
context = self._context
outjson = output_dir / f"{experiment_id}_{context}.json"

expected_fields = (CHECK_FIELDS[self._context]['required'] +
CHECK_FIELDS[self._context]['optional'])

output_config = {k: v for k, v in self._config.items() if k in expected_fields}
with outjson.open(mode='w') as outfile:
json.dump(output_config, outfile, indent=4, separators=(',', ': '))
outfile.write(str(self))
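
The effect of this refactoring is that `__str__` becomes the single place where the user-settable fields are filtered and serialized, and `save()` simply writes that string. A minimal sketch of the idea follows, assuming a toy `MiniConfiguration` class and a made-up `CHECK_FIELDS` table; neither is the real rsmtool object.

```python
import json
from pathlib import Path

# Stand-in for rsmtool's CHECK_FIELDS; the context and field names are made up.
CHECK_FIELDS = {'demo': {'required': ['experiment_id'],
                         'optional': ['description']}}


class MiniConfiguration:
    """Toy stand-in for the real Configuration class."""

    def __init__(self, config_dict, context='demo'):
        self._config = config_dict
        self._context = context

    def __str__(self):
        # keep only the fields that a user is allowed to set
        expected_fields = (CHECK_FIELDS[self._context]['required'] +
                           CHECK_FIELDS[self._context]['optional'])
        output_config = {k: v for k, v in self._config.items()
                         if k in expected_fields}
        return json.dumps(output_config, indent=4, separators=(',', ': '))

    def save(self, output_dir):
        # reuse __str__ so the file and the printed form cannot drift apart
        outjson = Path(output_dir) / 'demo.json'
        with outjson.open(mode='w') as outfile:
            outfile.write(str(self))
        return outjson
```

Because both paths go through `json.dumps()` with the same arguments, the saved file should match what `print(config)` shows.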

def check_exclude_listwise(self):
"""
@@ -827,6 +830,8 @@ def __init__(self, pathlike):
Raises
------
FileNotFoundError
If the given config file path does not exist.
ValueError
If the configuration file does not have a valid extension.
Valid extensions are ``.json`` and ``.cfg``.
@@ -835,6 +840,11 @@ def __init__(self, pathlike):
if isinstance(pathlike, str):
pathlike = Path(pathlike)

# raise an exception if the file does not exist
if not pathlike.exists():
raise FileNotFoundError(f"The configuration file {pathlike} "
"was not found.")

# make sure we have either a JSON or CFG configuration file
extension = pathlike.suffix.lower()
if extension not in ['.json', '.cfg']:
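
The order of the two checks matters: a missing file is reported as `FileNotFoundError` before its extension is inspected. A standalone sketch of that ordering is below. The function name `validate_config_path` and the wording of the `ValueError` message are illustrative assumptions; the real checks live inside the parser's `__init__`.

```python
from pathlib import Path


def validate_config_path(pathlike):
    """Hypothetical helper mirroring the checks in the hunk above."""
    if isinstance(pathlike, str):
        pathlike = Path(pathlike)

    # existence is checked first, so a missing file never reaches
    # the extension check
    if not pathlike.exists():
        raise FileNotFoundError(f"The configuration file {pathlike} "
                                "was not found.")

    # only .json and .cfg are accepted, case-insensitively
    extension = pathlike.suffix.lower()
    if extension not in ['.json', '.cfg']:
        # message text is an assumption, not rsmtool's wording
        raise ValueError(f"Unsupported extension {extension!r}; "
                         "expected .json or .cfg.")
    return pathlike
```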
10 changes: 5 additions & 5 deletions rsmtool/reader.py
@@ -227,7 +227,7 @@ def read_from_file(filename, converters=None, **kwargs):
------
ValueError
If the file has an extension that we do not support
pd.parser.CParserError
pandas.errors.ParserError
If the file is badly formatted or corrupt.
Note
@@ -263,10 +263,10 @@ def read_from_file(filename, converters=None, **kwargs):
warnings.filterwarnings('ignore', category=pd.io.common.DtypeWarning)
try:
df = do_read(filename, **kwargs)
except pd.parser.CParserError:
raise pd.parser.CParserError('Cannot read {}. Please check that it is '
'not corrupt or in an incompatible format. '
'(Try running dos2unix?)'.format(filename))
except pd.errors.ParserError:
raise pd.errors.ParserError('Cannot read {}. Please check that it is '
'not corrupt or in an incompatible format. '
'(Try running dos2unix?)'.format(filename))
return df

@staticmethod
85 changes: 47 additions & 38 deletions rsmtool/rsmcompare.py
@@ -10,22 +10,17 @@
:organization: ETS
"""


import argparse
import glob
import logging
import os
import sys

from os.path import (abspath,
exists,
join,
normpath)
from os.path import abspath, exists, join, normpath

from . import VERSION_STRING
from .configuration_parser import configure
from .reader import DataReader
from .reporter import Reporter
from .utils.commandline import generate_configuration, setup_rsmcmd_parser
from .utils.constants import VALID_PARSER_SUBCOMMANDS
from .utils.logging import LogFormatter


@@ -84,8 +79,9 @@ def run_comparison(config_file_or_obj_or_dict, output_dir):
Raises
------
ValueError
If any of the required fields are missing or ill-specified.
FileNotFoundError
If either of the two input directories in ``config_file_or_obj_or_dict``
does not exist, or if the directories do not contain rsmtool outputs at all.
"""

logger = logging.getLogger(__name__)
@@ -180,44 +176,57 @@ def main():
# set up the basic logging configuration
formatter = LogFormatter()

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(formatter)
# we need two handlers, one that prints to stdout
# for the "run" command and one that prints to stderr
# from the "generate" command; the latter is necessary
# because we do not want the warning to show up in the
# generated configuration file
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.setFormatter(formatter)

logging.root.addHandler(handler)
logging.root.setLevel(logging.INFO)
stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setFormatter(formatter)

# get a logger
logging.root.setLevel(logging.INFO)
logger = logging.getLogger(__name__)

# set up an argument parser
parser = argparse.ArgumentParser(prog='rsmcompare')

parser.add_argument('config_file', help="The JSON configuration file for "
"this comparison")
# set up an argument parser via our helper function
parser = setup_rsmcmd_parser('rsmcompare',
uses_output_directory=True,
uses_subgroups=True)

# if the first argument is not one of the valid sub-commands
# or one of the valid optional arguments, then assume that they
# are arguments for the "run" sub-command. This allows the
# old style command-line invocations to work without modification.
if sys.argv[1] not in VALID_PARSER_SUBCOMMANDS + ['-h', '--help',
'-V', '--version']:
args_to_pass = ['run'] + sys.argv[1:]
else:
args_to_pass = sys.argv[1:]
args = parser.parse_args(args=args_to_pass)

parser.add_argument('output_dir', nargs='?', default=os.getcwd(),
help="The output directory where the report "
"files for this comparison will be stored")
# call the appropriate function based on which sub-command was run
if args.subcommand == 'run':

parser.add_argument('-V', '--version', action='version',
version=VERSION_STRING)
# when running, log to stdout
logging.root.addHandler(stdout_handler)

# parse given command line arguments
args = parser.parse_args()
logger.info('Output directory: {}'.format(args.output_dir))
# run the experiment
logger.info('Output directory: {}'.format(args.output_dir))
run_comparison(abspath(args.config_file),
abspath(args.output_dir))

# convert all paths to absolute to make sure
# all files can be found later
config_file = abspath(args.config_file)
output_dir = abspath(args.output_dir)
else:

# make sure that the given configuration file exists
if not exists(config_file):
raise FileNotFoundError("Main configuration file {} not "
"found.".format(config_file))
# when generating, log to stderr
logging.root.addHandler(stderr_handler)

# generate a comparison report
run_comparison(config_file, output_dir)
# auto-generate an example configuration and print it to STDOUT
configuration = generate_configuration('rsmcompare',
use_subgroups=args.subgroups,
as_string=True)
print(configuration)


if __name__ == '__main__':
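
The dispatch logic in `main()` can be sketched without rsmtool installed. Everything below is an approximation under stated assumptions: `build_parser` is a hypothetical stand-in for `setup_rsmcmd_parser`, the value of `VALID_PARSER_SUBCOMMANDS` is assumed, and the guard for an empty argument list is an addition of this sketch (the hunk indexes `sys.argv[1]` directly).

```python
import argparse
import sys

# Assumed value; the real list lives in rsmtool.utils.constants.
VALID_PARSER_SUBCOMMANDS = ['generate', 'run']


def build_parser(prog):
    """Hypothetical stand-in for setup_rsmcmd_parser()."""
    parser = argparse.ArgumentParser(prog=prog)
    subparsers = parser.add_subparsers(dest='subcommand')

    run = subparsers.add_parser('run')
    run.add_argument('config_file')
    run.add_argument('output_dir', nargs='?', default='.')

    generate = subparsers.add_parser('generate')
    generate.add_argument('--subgroups', action='store_true')
    return parser


def normalize_args(argv):
    """Prepend 'run' so old-style calls like `prog config.json out` still work."""
    known = VALID_PARSER_SUBCOMMANDS + ['-h', '--help', '-V', '--version']
    if argv and argv[0] not in known:
        return ['run'] + argv
    return argv


def pick_log_stream(subcommand):
    # "generate" prints the configuration to stdout, so its log
    # messages go to stderr to keep redirected output clean
    return sys.stdout if subcommand == 'run' else sys.stderr
```

With this split, redirecting the output of the `generate` sub-command to a file captures only the configuration, while any warnings remain visible on the terminal.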