Add example #93

fmikaelian · 2019-03-28T21:41:06Z

No description provided.

… add-example

* initial structure 🏗
* Update requirements.txt
* Add config for packaging, CI and tests #11
* removing pytest to debug CI
* quiet pip install
* add pytest
* add dummy test
* Add structure for samples and examples #14
* Add Travis CI badge #16
* Implement download.py script (SQuAD fetch) #9
* Create LICENSE
* Add document retriever script #8
* fix typo name title column
* Move retriever example in /examples (currently in /samples) #24
* Add utils script to convert pandas df (title, content) to SQuAD #13
* fix typo
* Find a name for our QA software #23
* Find a name for our QA software #23
* Upload weights and metrics and update download.py script #21
* Add run_squad.py in cdqa/reader #34
* Add scrapper folder under /cdqa #30
* Upload weights and metrics and update download.py script #21
* Find a robust method to get articles paragraphs #1
* Update converter.py
* Update run_converter.py
* Find a robust method to get articles paragraphs #1
* Find a robust method to get articles paragraphs #1
* Initiate README structure #32
* add contributing guidelines
* precision on repo structure and workflow
* small typo fixes
* continue fixes
* Add fetch of BNP Paribas Newsroom dataset v.1.0 to download.py #46
* small fixes download/converter
* Changed url address for squad/evaluate-v1.1.py script #35
* Adapt retriever.py to BNP Paribas Newsroom dataset v.1.0 #45
* Adapt retriever.py to BNP Paribas Newsroom dataset v.1.0 #45
* Split run_squad.py in processing/train/predict #47
* Split run_squad.py in processing/train/predict #47
* add sklearn wrapper to run_squad.py to be able to call model from cdqa/pipeline
* add sklearn wrapper to run_squad.py to be able to call model from cdqa/pipeline
* Wrong dataset filename in examples #60
* small fixes
* add BertQA.fit() code
* add custom_weigths params in BertQA.fit()
* typo fixes
* remove uncessary args in BertQA() class
* Add scikit-learn wrapper interface for BertForQuestionAnswering
* update imports train/predict
* json file or object for input + combine doc retriever+reader in pipeline
* fix bad indents
* small fixes
* fix examples/features casting
* read_squad_examples() does not work with our custom input object #61
* small fixes in estimator/transformer classes
* Update run_squad with latest commits #64
* Split run squad.py in processing/train/predict (#66)
* small fixes and updates
* Adapt or disable verbose during model fit() #63
* Update run_squad.py
* Update run_squad with latest commits #64
* NameError: name 'device' is not defined in predict() method #68 (#69)
* #71 #65 (#73) Small fixes
* #75 (#76)
* fix #74 #36 #33 (#78)
* continue fix best answer across paragraphs (#80)
* fix #74 #36 #33
* update predict.py results
* Be compliant with the Github open source guide #81
* start new structure docs
* synchronise run-squad.py #82 (#83)
* Disable logger info for BertProcessor() #77 (#84)
* Disable logger info for BertProcessor() #77
* Adapt or disable verbose during model fit() #63
* Add comments + docstrings + changelog #79 (#86)
* Add comments + docstrings + changelog #79
* Add comments + docstrings + changelog #79
* Add comments + docstrings + changelog #79
* small typo fixes
* small typo in download.py
* fix typo readme (#88)
* Add comments + docstrings + changelog #79 (#89)
* Add example (#93)
* Add comments + docstrings + changelog #79
* add example notebook for prediction + small changes
* add example notebook for prediction + small changes
* add codecov
* add codecov badge
* sync with HF example (#94)
* Sync hf (#98)
* sync HF
* update docstring
* fix typo
* Added download of CPU version of model to download.py (#100)
* update example notebook and docstrings (#92, #90, #79) (#102)
* update example notebook and docstrings (#92, #90, #79)
* update docstrings #79
* continue #79
* add flake8 to pytest in CI
* start integrating rest api #35
* add info readme
* basic api #35
* update reqs
* add refs and badges #87 (#105)
* add refs and badges #87
* sync HF
* first version of paper
* Add sklearn wrapper for retriever as well #95
* Add sklearn wrapper for retriever as well #95
* update readme and clean repo
* update evaluation section in README
* debug-minor-updates (#106)
* Add github badges #87
* Disable verbose during predictions #103
* fix typos and tests #95
* Rename variables and scripts #108
* adapt notebook to new retriever class (#109)
* adapt notebook to new retriever class
* remove samples dir
* clean up repo and rename #108
* Fix predict berqa (#113)
* Rename variables and scripts #108
* Rename variables and scripts #108
* BertQA().predict() should return only 1 final predictions object #110
* Created a sklearn wrapper for the QA Pipeline (#101)
* Implemented QAPipeline object that do the whole process for question-answering
* Added option to attribute model: path (string) or joblib object
* corrected typo
* Created example of jupyter notebook for use of qa_pipeline
* Update notebook example
* Added description of QAPipeline class"
* Added descriptions to all methods of QAPipeline class"
* Corrected typo
* Added download of CPU version of model to download.py (#100)
* update example notebook and docstrings (#92, #90, #79) (#102)
* update example notebook and docstrings (#92, #90, #79)
* update docstrings #79
* continue #79
* add flake8 to pytest in CI
* start integrating rest api #35
* add info readme
* basic api #35
* update reqs
* add refs and badges #87 (#105)
* add refs and badges #87
* sync HF
* first version of paper
* Add sklearn wrapper for retriever as well #95
* Add sklearn wrapper for retriever as well #95
* update readme and clean repo
* update evaluation section in README
* debug-minor-updates (#106)
* Add github badges #87
* Disable verbose during predictions #103
* fix typos and tests #95
* Rename variables and scripts #108
* adapt notebook to new retriever class (#109)
* adapt notebook to new retriever class
* remove samples dir
* clean up repo and rename #108
* Fix predict berqa (#113)
* Rename variables and scripts #108
* Rename variables and scripts #108
* BertQA().predict() should return only 1 final predictions object #110
* Implemented QAPipeline object that do the whole process for question-answering
* Added option to attribute model: path (string) or joblib object
* corrected typo
* Created example of jupyter notebook for use of qa_pipeline
* Update notebook example
* Added description of QAPipeline class"
* Added descriptions to all methods of QAPipeline class
* Corrected typo
* Changed code from qa_pipeline.py to cdqa_sklearn.py
* seperated kwargs for declaration of different classes within QAPipeline
* removed qa_pipeline.py
* Implemented predict() and retriever part of fit()
* Implemented reader training in fit() and completed documentation
* Modified documentation for predict() method
* Deleted useless tutorial
* Created notebook example for pipeline
* Modified converter.py to correct for the creation of repeated articles… (#116)
* Modified converter.py to correct for the creation of repeated articles in generate_squad_examples
* included options for min and max length in filter_paragraphs()
* Implement automatic pypi upload on master release #107
* Debug, small fixes and doc updates (#117)
* Update CONTRIBUTING.md with new tree structure #112
* Build REST API using QAPipeline() #118
* start updating README with cdqa pipeline method
* Allow for model evaluation directly from cdqa #104
* add filter script in utils for all data cleaning tasks
* update badges pypi
* Allow for model evaluation directly from cdqa #104
* api style fixing + update demo notebook
* Build REST API using QAPipeline() #118
* update README and naming
* update README
* update tree structure
* corrected typo related to sync with HF (#126)
* Updated BertQA to enable multiple trainings and handled some errors (#130)
* modified BertQA class to enable multiple calls to fit()
* cerrected typo
* Deleted tokenizer saving inside BertQA.fit
* handled problem with self.output_dir
* Implemented fit_reader() method and fixed fit() method. (#131)
* replaced self.model by self.reader
* Implemented fit_reader(), fixed fit() and updated doc
* sync HF + auto export json in scrapper + move filters (#129)
* sync HF + auto export json in scrapper + move filters
* change wording converter => converters
* fix typo api
* update version of dataset
* update version of dataset (2)
* predict() method should also give back index of document + paragraph #91 (#132)
* update API and notebook
* Implemented multiple prediction in QAPipeline.predict() (#135)
* replaced self.model by self.reader
* Implemented fit_reader(), fixed fit() and updated doc
* --
* Implemented multiple predictions in qa_pipeline
* removed not used import
* Improved doc of predict method
* Handled error for predictions on GPU
* filter paragraphs script (#140)
* implemented methods to send reader to GPU or CPU inside QAPipeline (#143)
* debug and update filters (#141)
* small fixes
* Fixed some errors (#145)
* fixed typo
* added other needed changes when sending to different devices
* add instructions for reader training on SQuAD
* Put all the display of messages under the verbose condition (#147)
* Update issue templates (#148)
* Update issue templates
* remove old issue template method
* Be compliant with the Github open source guide #81
* Deleted not used folders and included option to save logs with default False (#150)
* Deleted useless folders for github repo
* Added option in BertQA to save logs, the default is false
* Implemented a better way to have the option to save logs
* Updated documentation
* Removed useless parameter in BertQA
* Updated explanations to train and to evaluate reader on README (#151)
* Updated explanation to train reader on README file
* Updated explanation to evaluate reader on README file
* Added explanation to evaluate pipeline
* Implemented function to evaluate pipeline (#152)
* Implemented function to evaluate pipeline
* modified name of module from metrics.py to evaluate.py
* Changed evaluate_from_files to evaluate_reader / modified name of the module (evaluate.py to evaluation.py)
* README updates
* update run_squad.py
* create content column inside qa_pipeline
* update refs
* clean up docstrings after content col removal
* fix typos and start write unit tests #136 (#155)
* + pdf_converter (#149)
* + pdf_converter
* README updates
* Use the sys.argv and save the data on a csv
* change '\n'.join() by ' '.join() in order to correct the csv
* update run_squad.py
* create content column inside qa_pipeline
* update refs
* clean up docstrings after content col removal
* minor bugs
* fix typos and start write unit tests #136 (#155)
* + pdf_converter
* Use the sys.argv and save the data on a csv
* change '\n'.join() by ' '.join() in order to correct the csv
* minor bugs
* update README
* change param name to sth meaningful
* fix typos and start migration to org
* update URLs
* corrected bug when predicting with log and set do_lower_case do False as default (the default BERT model we use is uncased) (#160)
* change LICENSE + fix typo README
* Included the whole team in the author paramater in setup.py (#161)
* Prepare repo for release #159 (#162)
* Prepare repo for release #159
* add GPU saving to example of reader training on SQuAD
* remove useless dep
* update README
* Updated download.py and removed docs repo (#163)
* updated download.py
* deleted docs repo
* fix LICENSE badge bug (#166)
* Last updates on download / setup / NB example to official release (#168)
* moved download.py to root and updated it to download models and BNP Paribas dataset
* changed version in setup.py to 1.0.0
* updated tutorial example with changes in repository
* done some minor updates on download.py
* removed PyGithub from requirements as we do not use it anymore

fmikaelian added 4 commits March 27, 2019 22:04

Add comments + docstrings + changelog #79

3747664

add example notebook for prediction + small changes

5d506ee

add example notebook for prediction + small changes

41678fc

Merge branch 'add-example' of https://github.com/fmikaelian/cdQA into…

4b1fa01

… add-example

fmikaelian merged commit 89192b0 into develop Mar 28, 2019

fmikaelian deleted the add-example branch March 28, 2019 21:41
