diff --git a/.gitignore b/.gitignore index eed7e82a..b0334d56 100644 --- a/.gitignore +++ b/.gitignore @@ -113,5 +113,8 @@ data nohup.out # hide wa config -*nlc2cmd/remote/config.json +clai/server/plugins/nlc2cmd/remote/config.json +# hide local gitbot stuff config +clai/server/plugins/gitbot/config.json +!clai/server/plugins/gitbot/rasa/data diff --git a/README.md b/README.md index 00096c3e..ab24f7ab 100644 --- a/README.md +++ b/README.md @@ -220,7 +220,7 @@ As before, CLAI skill will not execute without your permission unless `auto` mod ## :robot: Want to build your own skills? -[`fixit`](clai/server/plugins/fix_bot)   [`nlc2cmd`](clai/server/plugins/nlc2cmd)   [`helpme`](clai/server/plugins/helpme)   [`howdoi`](clai/server/plugins/howdoi)   [`man page explorer`](clai/server/plugins/manpage_agent)   [`ibmcloud`](clai/server/plugins/ibmcloud) +[`fixit`](clai/server/plugins/fix_bot)   [`nlc2cmd`](clai/server/plugins/nlc2cmd)   [`helpme`](clai/server/plugins/helpme)   [`howdoi`](clai/server/plugins/howdoi)   [`man page explorer`](clai/server/plugins/manpage_agent)   [`ibmcloud`](clai/server/plugins/ibmcloud)   [`tellina`](clai/server/plugins/tellina)   [`dataxplore`](clai/server/plugins/dataxplore)   [`gitbot`](clai/server/plugins/gitbot) Project CLAI is intended to rekindle the spirit of AI softbots by providing a plug-and-play framework and simple interface abstractions to the Bash and its underlying operating system. Developers can access the command line through a simple `sense-act` API for rapid prototyping of newer and more complex AI capabilities. 
diff --git a/clai/emulator/run.gif b/clai/emulator/run.gif index 32b3ce3e..bd10154b 100644 Binary files a/clai/emulator/run.gif and b/clai/emulator/run.gif differ diff --git a/clai/emulator/stop.gif b/clai/emulator/stop.gif index 30d5e670..2d60b437 100644 Binary files a/clai/emulator/stop.gif and b/clai/emulator/stop.gif differ diff --git a/clai/server/README.md b/clai/server/README.md index 42db0402..4c4cba8e 100644 --- a/clai/server/README.md +++ b/clai/server/README.md @@ -30,7 +30,7 @@ CLAI comes with a set of orchestrators to help you get the best out of the Orche > [`threshold_orchestrator`](orchestration/patterns/threshold_orchestrator) This is similar to the `max_orchestrator` but it maintains thresholds specific to each skill, and updates them according to how the end user reacts to them. -> [`bandit_orchestrator`](orchestration/patterns/bandit_orchestrator) This learns user preferences using contextual bandits. +> [`bandit_orchestrator`](orchestration/patterns/rltk_bandit_orchestrator) This learns user preferences using contextual bandits. These are housed in the [orchestration/patterns/](orchestration/patterns) folder under packages with the same name. Follow them as examples to build your own favorite orchestration pattern. @@ -197,7 +197,7 @@ current_state_pre.command.suggested_command = clear > **Note:** The feedback is recorded in the next action since one may want to look at the follow-up to see whether the user is using a suggestion, i.e. the feedback may not always be directly tied to the user response on `y/n/e` during the current pre-process stage. This is especially the case when skills -- such as the [`nlc2cmd skill`](plugins/nlc2cmd) -- do not suggest a command that can be used directly. -Check out the `bandit_orchestrator` for an [example](orchestration/patterns/bandit_orchestrator/bandit_orchestrator.py#L82). +Check out the `bandit_orchestrator` for an [example](orchestration/patterns/rltk_bandit_orchestrator/rltk_bandit_orchestrator.py).
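The indirect-feedback check mentioned in the note can be sketched as follows. The helper name below is hypothetical, but the logic mirrors the match-score computation in the bandit orchestrator's `record_transition`: the reward is granted when the command the user actually executed shares enough tokens with the text of the previous suggestion.

```python
# Hypothetical sketch of the indirect-feedback heuristic: compute how much of
# the executed command is contained in the previous suggestion's text, and
# grant a reward when that overlap clears a threshold (0.7 is the default in
# the orchestrator's config).

def match_score(executed_command: str, suggestion_text: str) -> float:
    """Fraction of the executed command's tokens that appear in the suggestion."""
    base = set(executed_command.split())
    reference = set(suggestion_text.split())
    return len(base & reference) / len(base)

score = match_score("ls -la /tmp", "ls -la /tmp lists all files, including hidden ones")
reward = float(score > 0.7)  # 1.0: every token of the executed command is in the suggestion
```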
### Save and Load @@ -218,6 +218,10 @@ Check out the `threshold_orchestrator` for an example of [maintaining state](orc ## Related Publications and Links -> Upadhyay, S., Agarwal, M., Bounneffouf, D., & Khazaeni, Y. (2019). -A Bandit Approach to Posterior Dialog Orchestration Under a Budget. +> A Bandit Approach to Posterior Dialog Orchestration Under a Budget. +Sohini Upadhyay, Mayank Agarwal, Djallel Bounneffouf, Yasaman Khazaeni. NeurIPS 2018 Conversational AI Workshop. + +> A Unified Conversational Assistant Framework for Business Process Automation. +Yara Rizk, Abhisekh Bhandwalder, Scott Boag, Tathagata Chakraborti, Vatche Isahagian, Yasaman Khazaeni, +Falk Pollock, and Merve Unuvar. AAAI 2020 Workshop on Intelligent Process Automation. diff --git a/clai/server/orchestration/patterns/bandit_orchestrator/README.md b/clai/server/orchestration/patterns/bandit_orchestrator/README.md deleted file mode 100644 index 4bc88d6b..00000000 --- a/clai/server/orchestration/patterns/bandit_orchestrator/README.md +++ /dev/null @@ -1,14 +0,0 @@ -# Bandit-based Orchestration - -> :warning: :warning: This orchestration pattern is developed on top of IBM Research's internal `rltk` toolkit for reward-based learning and **would not run on your machine**. You are welcome to develop with your own favorite ML platform until such time `rltk` becomes open source. - -This is an illustration of an orchestration pattern that learns based on user feedback using contextual bandits. -The context is given by the active skills and their corresponding self-reported confidences, while the reward -is either received: - -+ directly if the user accepts a suggestion with a `y/n` response (e.g. for the `howdoi` or `man page explorer` skills); or -+ indirectly if they execute a command that follows the suggestion closely (e.g. for the `nlc2cmd` or `fixit` skills). 
- -An orchestration layer that can adapt to user interactions over time allows you to develop CLIs that are -personalized to the needs of individual users or user types, as well as deal with miscalibrated -condifences of skills. diff --git a/clai/server/orchestration/patterns/bandit_orchestrator/bandit_orchestrator.py b/clai/server/orchestration/patterns/bandit_orchestrator/bandit_orchestrator.py deleted file mode 100644 index daf4bb95..00000000 --- a/clai/server/orchestration/patterns/bandit_orchestrator/bandit_orchestrator.py +++ /dev/null @@ -1,111 +0,0 @@ -# -# Copyright (C) 2020 IBM. All Rights Reserved. -# -# See LICENSE.txt file in the root directory -# of this source tree for licensing information. -# - -""" -This example demonstrates the use of Contextual Thomspon Sampling to calibrate the -selector for CLAI skills --> https://pages.github.ibm.com/AI-Engineering/bandit-core/ -""" - -from typing import Optional, List, Union -from pathlib import Path - -import os -import numpy -# pylint: disable=import-error -import bandits - -from clai.server.orchestration.orchestrator import Orchestrator -from clai.server.command_message import State, Action, TerminalReplayMemoryComplete -from clai.server.command_message import TerminalReplayMemory - - -# pylint: disable=too-many-arguments,unused-argument -class Bandit(Orchestrator): - - def __init__(self): - super(Bandit, self).__init__() - - self._path_to_config_file = os.path.join(Path(__file__).parent.absolute(), 'config.yml') - self._agent = bandits.instantiate_from_file(self._path_to_config_file) - self._event_number = 0 - - self._selected_arm = None - self._match_threshold = 0.7 - - self.load_state() - - def get_orchestrator_state(self): - state = { - 'self': self._agent, - 'event': self._event_number - } - return state - - def load_state(self): - state = self.load() - self._agent = state.get('self', bandits.instantiate_from_file(self._path_to_config_file)) - self._event_number = state.get('event', 0) - - def 
choose_action(self, command: State, agent_names: List[str], - candidate_actions: Optional[List[Union[Action, List[Action]]]], - force_response: bool, pre_post_state: str) -> Optional[Action]: - - if not candidate_actions: - return None - - if isinstance(candidate_actions, Action): - candidate_actions = [candidate_actions] - - self._event_number += 1 - - # current context is a vector of confidences - # in future, extend this with more State and Agent info - context_dimension = self._agent.serialize()['policy'].dim - context_data = context_dimension * [0.0] - - for i, candidate_action in candidate_actions: - context_data[i] = self.__calculate_confidence__(candidate_action) - - context_data = numpy.asarray(context_data) - - self._selected_arm = self._agent.choose(self._event_number, context_data)[0] - - # map arm to action and agent name - try: - # index may be out of range - return candidate_actions[self._selected_arm] - # pylint: disable=bare-except - except: - return None - - def record_transition(self, prev_state: TerminalReplayMemoryComplete, - current_state_pre: TerminalReplayMemory): - - prev_state_pre = prev_state.pre_replay - prev_state_post = prev_state.post_replay - - if prev_state_pre.command.action_suggested is None or \ - prev_state_post.command.action_suggested is None or \ - prev_state_post.command.action_suggested.suggested_command == self.noop_command: - return - - reward = float(prev_state_post.command.suggested_executed) - - currently_executed_command = current_state_pre.command.command - all_the_stuff_from_last_execution = prev_state_pre.command.action_suggested.suggested_command \ - + prev_state_pre.command.action_suggested.description \ - + prev_state_post.command.action_suggested.description - - base = set(currently_executed_command.split()) - reference = set(all_the_stuff_from_last_execution.split()) - - # check how much of the current commands is contained in the stuff from last time - match_score = len(base & reference) / len(base) - reward 
+= float(match_score > self._match_threshold) - - self._agent.observe(self._event_number, reward) - self.save() diff --git a/clai/server/orchestration/patterns/bandit_orchestrator/config.yml b/clai/server/orchestration/patterns/bandit_orchestrator/config.yml deleted file mode 100644 index aeae9450..00000000 --- a/clai/server/orchestration/patterns/bandit_orchestrator/config.yml +++ /dev/null @@ -1,10 +0,0 @@ -# Config file using the contextual/thompson pattern and providing its parameters -# This configuration causes the bandit to do no logging of its activity - -pattern: contextual/thompson -num_actions: 10 -dim: 10 - -# Dimension set to a maximum of n=10 skills -# The i-th dimension is the confidence returned by skill i -# Thus number of arms is equal to the dimension for now \ No newline at end of file diff --git a/clai/server/orchestration/patterns/bandit_orchestrator/install.sh b/clai/server/orchestration/patterns/bandit_orchestrator/install.sh deleted file mode 100755 index efdd8fc5..00000000 --- a/clai/server/orchestration/patterns/bandit_orchestrator/install.sh +++ /dev/null @@ -1,51 +0,0 @@ -#!/usr/bin/env bash - -echo "===============================================================" -echo "" -echo " Phase 1: Installing necessary tools" -echo "" -echo "===============================================================" - -DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )" -FRAMEWORK_DIR="${DIR}/framework" - -if [ -d "${FRAMEWORK_DIR}" ]; then - rm -rf "${FRAMEWORK_DIR}" -fi - -mkdir -p "${FRAMEWORK_DIR}" - - -echo " >> Cloning framework libraries" -echo "===============================================================" - -cd "${FRAMEWORK_DIR}" - -git clone -q -b develop --depth 5 https://github.ibm.com/AI-Engineering/python-component-framework.git -git clone -q -b master --depth 5 https://github.ibm.com/AI-Engineering/bandit-core.git - - -echo " >> Installing components library" -echo 
"===============================================================" - -cd "${FRAMEWORK_DIR}/python-component-framework" -pip install -q --user . -NEW_PYTHONPATH="${FRAMEWORK_DIR}/python-component-framework/pycomp" - - -echo " >> Installing bandit library" -echo "===============================================================" - -cd "${FRAMEWORK_DIR}/bandit-core" -pip install -q --user . -NEW_PYTHONPATH="${NEW_PYTHONPATH}:${FRAMEWORK_DIR}/bandit-core/bandits:${FRAMEWORK_DIR}/bandit-core/patterns" - - -case `grep -F "${NEW_PYTHONPATH}" ~/.bashrc >/dev/null; echo $?` in - 1) - echo "\n\n export PYTHONPATH=\""${PYTHONPATH}":"${NEW_PYTHONPATH}"\"\n" >> ~/.bashrc - ;; -esac - -echo " >> Installing python dependencies" -pip3 install -r requirements.txt \ No newline at end of file diff --git a/clai/server/orchestration/patterns/rltk_bandit_orchestrator/README.md b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/README.md new file mode 100644 index 00000000..b70f65a7 --- /dev/null +++ b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/README.md @@ -0,0 +1,39 @@ +# Bandit-based Orchestration + +> :warning: :warning: This orchestration pattern is developed on top of IBM Research's +internal `rltk` toolkit for reward-based learning and **would not run on general machine**. +You are welcome to develop with your own favorite ML platform until such time `rltk` +becomes open source. + +This is an illustration of an orchestration pattern that learns based on user feedback +using contextual bandits. The context is given by the active skills and their corresponding +self-reported confidences, while the reward is either received: + ++ directly if the user accepts a suggestion with a `y/n` response +(e.g. for the `howdoi` or `man page explorer` skills); or ++ indirectly if they execute a command that follows the suggestion closely +(e.g. for the `nlc2cmd` or `fixit` skills). 
+ +An orchestration layer that can adapt to user interactions over time allows you to +develop CLIs that are personalized to the needs of individual users or user types, +as well as deal with miscalibrated confidences of skills. + +Bandits - and Reinforcement Learning based agents in general - require an initial +phase of exploration which can adversely affect the end-user experience. To bypass +this phase, the bandits can be warm-started with a particular profile. Four profiles +are included in the package: + +- `max-orchestrator`: Starts the bandit orchestrator as a max orchestrator. This behavior +then adapts over time to the user's behavior. +- `ignore-clai`: Ignores CLAI altogether and treats each command as a native bash command. +- `ignore-skill`: Ignores a particular skill while retaining `max-orchestrator` +behavior for the rest, and +- `prefer-skill`: Prefers one skill over another and is useful in scenarios where a user +prefers one skill from a pool of skills with overlapping domains.
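The warm-start idea above can be illustrated with a small sketch. This is not the `rltk` API (which is internal); it only shows, under assumed names, how synthetic data for the `max-orchestrator` profile could be generated: for each round of random confidences, the arm with the highest confidence receives reward +1 and every other arm receives -1.

```python
import numpy as np

# Illustrative sketch (not the rltk API) of max-orchestrator warm-start data:
# each synthetic round contributes one (arm, reward) pair per arm, with +1 for
# the highest-confidence arm and -1 for the rest.

def max_orchestrator_warmstart(n_points: int, context_size: int, seed: int = 0):
    rng = np.random.default_rng(seed)
    contexts, arm_rewards = [], []
    for _ in range(n_points):
        confs = rng.random(context_size)   # fake self-reported skill confidences
        best = int(np.argmax(confs))
        for arm in range(context_size):
            contexts.append(confs)
            arm_rewards.append((arm, 1.0 if arm == best else -1.0))
    return np.array(contexts), arm_rewards

contexts, rewards = max_orchestrator_warmstart(n_points=5, context_size=3)
# 5 rounds x 3 arms = 15 pairs, exactly one +1 reward per round
```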
+ +| Warm-start behavior | Preview | +| ----- | ----- | +| `max-orchestrator` | | +| `ignore-clai` | | +| `ignore-nlc2cmd` | | +| `prefer-manpage-over-nlc2cmd` | | diff --git a/clai/server/orchestration/patterns/bandit_orchestrator/__init__.py b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/__init__.py similarity index 100% rename from clai/server/orchestration/patterns/bandit_orchestrator/__init__.py rename to clai/server/orchestration/patterns/rltk_bandit_orchestrator/__init__.py diff --git a/clai/server/orchestration/patterns/rltk_bandit_orchestrator/bandit_config.json b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/bandit_config.json new file mode 100644 index 00000000..dd5db32d --- /dev/null +++ b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/bandit_config.json @@ -0,0 +1,9 @@ +{ + "noop_confidence": 0.1, + "warm_start": true, + "warm_start_config": { + "type": "max-orchestrator", + "kwargs": {} + }, + "reward_match_threshold": 0.7 +} \ No newline at end of file diff --git a/clai/server/orchestration/patterns/rltk_bandit_orchestrator/config.yml b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/config.yml new file mode 100644 index 00000000..ace8b2f9 --- /dev/null +++ b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/config.yml @@ -0,0 +1,10 @@ +# Config file using the contextual/thompson pattern and providing its parameters +# This configuration causes the bandit to do no logging of its activity + +pattern: contextual/thompson +num_actions: 10 +context_size: 10 + +# Number of actions is set to a maximum of 10. This means a maximum of 10 installed skills +# (including a NOOP action) are supported. 
+# Context size should be equal to the number of actions diff --git a/clai/server/orchestration/patterns/rltk_bandit_orchestrator/install.sh b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/install.sh new file mode 100644 index 00000000..ea9c512b --- /dev/null +++ b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/install.sh @@ -0,0 +1,38 @@ +#!/usr/bin/env bash + +echo "===============================================================" +echo "" +echo " Phase 1: Installing necessary tools" +echo "" +echo "===============================================================" + +DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )" +FRAMEWORK_DIR="${DIR}/framework" + +if [ -d "${FRAMEWORK_DIR}" ]; then + rm -rf "${FRAMEWORK_DIR}" +fi + +mkdir -p "${FRAMEWORK_DIR}" + + +echo " >> Cloning framework libraries" +echo "===============================================================" + +cd "${FRAMEWORK_DIR}" + +# Download and install RLTK library into the rltk folder and uncomment the +# bottom two lines + + +echo " >> Installing RLTK library" +echo "===============================================================" + +# cd "${FRAMEWORK_DIR}/rltk" +# python3 -m pip install -q --user . 
+ + +echo " >> Installing python dependencies" +echo "===============================================================" + +python3 -m pip install -r requirements.txt \ No newline at end of file diff --git a/clai/server/orchestration/patterns/bandit_orchestrator/manifest.properties b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/manifest.properties similarity index 100% rename from clai/server/orchestration/patterns/bandit_orchestrator/manifest.properties rename to clai/server/orchestration/patterns/rltk_bandit_orchestrator/manifest.properties diff --git a/clai/server/orchestration/patterns/bandit_orchestrator/requirements.txt b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/requirements.txt similarity index 100% rename from clai/server/orchestration/patterns/bandit_orchestrator/requirements.txt rename to clai/server/orchestration/patterns/rltk_bandit_orchestrator/requirements.txt diff --git a/clai/server/orchestration/patterns/rltk_bandit_orchestrator/rltk_bandit_orchestrator.py b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/rltk_bandit_orchestrator.py new file mode 100644 index 00000000..92fd507e --- /dev/null +++ b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/rltk_bandit_orchestrator.py @@ -0,0 +1,255 @@ +# +# Copyright (C) 2020 IBM. All Rights Reserved. +# +# See LICENSE.txt file in the root directory +# of this source tree for licensing information. 
+# + +""" +This example demonstrates the use of Contextual Thompson Sampling +to calibrate the selector for CLAI skills +""" + +from typing import Optional, List, Union +from pathlib import Path + +import os +import json +import numpy as np + +from rltk import instantiate_from_file # pylint: disable=import-error + +from clai.server.orchestration.orchestrator import Orchestrator +from clai.server.command_message import State, Action +from clai.server.command_message import TerminalReplayMemory, TerminalReplayMemoryComplete +from clai.server.logger import current_logger as logger + +from . import warm_start_datagen + + +# pylint: disable=too-many-arguments,unused-argument,too-many-instance-attributes +class RLTKBandit(Orchestrator): + + def __init__(self): + super(RLTKBandit, self).__init__() + + self._config_filepath = os.path.join(Path(__file__).parent.absolute(), 'config.yml') + self._bandit_config_filepath = os.path.join(Path(__file__).parent.absolute(), 'bandit_config.json') + + self._noop_confidence = None + self._agent = None + self._n_actions = None + self._action_order = None + self._warm_start = None + self._warm_start_type = None + self._warm_start_kwargs = None + self._reward_match_threshold = None + + self.load_bandit_state() + self.load_state() + self.warm_start_orchestrator() + + def load_bandit_state(self): + + with open(self._bandit_config_filepath, 'r') as conf_file: + bandit_config = json.load(conf_file) + + self._noop_confidence = bandit_config['noop_confidence'] + self._warm_start = bandit_config['warm_start'] + self._warm_start_type = bandit_config['warm_start_config']['type'] + self._warm_start_kwargs = bandit_config['warm_start_config']['kwargs'] + self._reward_match_threshold = bandit_config.get('reward_match_threshold', 0.7) + + def get_orchestrator_state(self): + + state = { + 'agent': self._agent, + 'action_order': self._action_order, + 'warm_start': self._warm_start + } + return state + + def load_state(self): + + state = self.load() + 
default_action_order = {self.noop_command: 0} + + self._agent = state.get('agent', None) + if self._agent is None: + self._agent = instantiate_from_file(self._config_filepath) + + self._action_order = state.get('action_order', None) + if self._action_order is None: + self._action_order = default_action_order + + self._n_actions = self._agent.num_actions + self._warm_start = state.get('warm_start', self._warm_start) + + def warm_start_orchestrator(self): + """ + Warm starts the orchestrator (pre-trains the weights) to suit a + particular profile + """ + + def noop_setup(): + profile = 'noop-always' + kwargs = { + 'n_points': 1000, + 'context_size': self._n_actions, + 'noop_position': 0 + } + return profile, kwargs + + def ignore_skill_setup(skill_name): + self.__add_to_action_order__(skill_name) + profile = 'ignore-skill' + kwargs = { + 'n_points': 1000, + 'context_size': self._n_actions, + 'skill_idx': self._action_order[skill_name] + } + return profile, kwargs + + def max_orchestrator_setup(): + profile = 'max-orchestrator' + kwargs = { + 'n_points': 1000, + 'context_size': self._n_actions + } + return profile, kwargs + + def preferred_skill_orchestrator_setup(advantage_skill, disadvantage_skill): + self.__add_to_action_order__(advantage_skill) + self.__add_to_action_order__(disadvantage_skill) + profile = 'preferred-skill' + kwargs = { + 'n_points': 1000, + 'context_size': self._n_actions, + 'advantage_skillidx': self._action_order[advantage_skill], + 'disadvantage_skillidx': self._action_order[disadvantage_skill] + } + return profile, kwargs + + try: + warm_start_methods = { + 'noop': noop_setup, + 'ignore-skill': ignore_skill_setup, + 'max-orchestrator': max_orchestrator_setup, + 'preferred-skill': preferred_skill_orchestrator_setup + } + + method = warm_start_methods[self._warm_start_type.lower()] + profile, kwargs = method(**self._warm_start_kwargs) + + tids, contexts, arm_rewards = warm_start_datagen.get_warmstart_data( + profile, **kwargs + ) + + 
self._agent.warm_start(tids, arm_rewards, contexts=contexts) + self._warm_start = False + + self.save() + except Exception as err: + logger.warning('Exception in warm starting orchestrator. Error: ' + str(err)) + raise err + + def choose_action(self, + command: State, agent_names: List[str], + candidate_actions: Optional[List[Union[Action, List[Action]]]], + force_response: bool, + pre_post_state: str): + + if not candidate_actions: + return None + + if isinstance(candidate_actions, Action): + candidate_actions = [candidate_actions] + + context = self.__build_context__(candidate_actions) + action_idx = self._agent.choose(t_id=command.command_id, + context=context, + num_arms=1) + suggested_action = self.__choose_action__(action_idx[0], candidate_actions) + + if suggested_action is None: + suggested_action = Action(suggested_command=command.command) + + return suggested_action + + def __build_context__(self, + candidate_actions: Optional[List[Union[Action, List[Action]]]] + ) -> np.ndarray: + + context = [0.0] * self._n_actions + + noop_pos = self._action_order[self.noop_command] + context[noop_pos] = self._noop_confidence + + for action in candidate_actions: + + self.__add_to_action_order__(action.agent_owner) + + pos = self._action_order[action.agent_owner] + conf = self.__calculate_confidence__(action) + context[pos] = conf + + return np.array(context, dtype=float) + + def __add_to_action_order__(self, agent_name): + + if agent_name in self._action_order: + return + + max_action_order = max(self._action_order.values()) + self._action_order[agent_name] = max_action_order + 1 + + def __choose_action__(self, + action_idx: int, + candidate_actions: Optional[List[Union[Action, List[Action]]]]): + + suggested_agent = None + for agent_name, agent_idx in self._action_order.items(): + if agent_idx == action_idx: + suggested_agent = agent_name + break + + if suggested_agent == self.noop_command or suggested_agent is None: + return None + + for action in candidate_actions:
+ if action.agent_owner == suggested_agent: + return action + + return None + + def record_transition(self, + prev_state: TerminalReplayMemoryComplete, + current_state_pre: TerminalReplayMemory): + + try: + prev_state_pre = prev_state.pre_replay + prev_state_post = prev_state.post_replay + + if prev_state_pre.command.action_suggested is None or \ + prev_state_post.command.action_suggested is None or \ + prev_state_post.command.action_suggested.suggested_command == self.noop_command: + return + + reward = float(prev_state_post.command.suggested_executed) + + currently_executed_command = current_state_pre.command.command + all_the_stuff_from_last_execution = \ + prev_state_pre.command.action_suggested.suggested_command + \ + prev_state_pre.command.action_suggested.description + \ + prev_state_post.command.action_suggested.description + + base = set(currently_executed_command.split()) + reference = set(all_the_stuff_from_last_execution.split()) + + # check how much of the current commands is contained in the stuff from last time + match_score = len(base & reference) / len(base) + reward += float(match_score > self._reward_match_threshold) + + self._agent.observe(prev_state.post_replay.command.command_id, reward) + except Exception as err: # pylint: disable=broad-except + logger.warning(f'Error in record_transition of bandit orchestrator. 
Error: {err}') diff --git a/clai/server/orchestration/patterns/rltk_bandit_orchestrator/warm_start_datagen.py b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/warm_start_datagen.py new file mode 100644 index 00000000..3bf9a2b4 --- /dev/null +++ b/clai/server/orchestration/patterns/rltk_bandit_orchestrator/warm_start_datagen.py @@ -0,0 +1,186 @@ +import numpy as np + + +def get_noop_warmstart_data(n_points, context_size, noop_position): + """ generates warm start data for noop behavior """ + + confidence_vals = np.random.rand(n_points, context_size) + + data_tids = [] + data_contexts = [] + data_arm_rewards = [] + tid = 0 + + for i in range(n_points): + confs = confidence_vals[i] + + for arm in range(context_size): + data_tids.append(f'warm-start-tid-{tid}') + reward = 1.0 if arm == noop_position else -1.0 + + data_contexts.append(confs) + data_arm_rewards.append((arm, reward)) + + tid += 1 + + # randomise order; permute over all generated samples, not just n_points + idxorder = np.random.permutation(len(data_tids)) + + data_tids = [data_tids[i] for i in idxorder] + data_contexts = [data_contexts[i] for i in idxorder] + data_arm_rewards = [data_arm_rewards[i] for i in idxorder] + + return data_tids, np.array(data_contexts), data_arm_rewards + + +def get_ignore_skill_warmstart_data(n_points, context_size, skill_idx): + """ generates warm start data for always ignoring a skill behavior """ + + confidence_vals = np.random.rand(n_points, context_size) + + data_tids = [] + data_contexts = [] + data_arm_rewards = [] + tid = 0 + + for i in range(n_points): + confs = confidence_vals[i] + + confs_sortidx = np.argsort(confs) + max_confidx = confs_sortidx[-1] + second_max_confidx = confs_sortidx[-2] + + # Negative reward on choosing the specified skill + reward = -1.0 + data_tids.append(f'warm-start-tid-{tid}') + data_contexts.append(list(confs)) + data_arm_rewards.append((skill_idx, reward)) + tid += 1 + + # Positive reward on selecting the maximum skill + if max_confidx != skill_idx: + reward = +1.0 +
data_tids.append(f'warm-start-tid-{tid}') + data_contexts.append(list(confs)) + data_arm_rewards.append((max_confidx, reward)) + tid += 1 + else: + reward = +1.0 + data_tids.append(f'warm-start-tid-{tid}') + data_contexts.append(list(confs)) + data_arm_rewards.append((second_max_confidx, reward)) + tid += 1 + + # randomise order; permute over all generated samples, not just n_points + idxorder = np.random.permutation(len(data_tids)) + + data_tids = [data_tids[i] for i in idxorder] + data_contexts = [data_contexts[i] for i in idxorder] + data_arm_rewards = [data_arm_rewards[i] for i in idxorder] + + return data_tids, np.array(data_contexts), data_arm_rewards + + +def get_max_skill_warmstart_data(n_points, context_size): + """ generates warm start data for max orchestrator behavior """ + + confidence_vals = np.random.rand(n_points, context_size) + + data_tids = [] + data_contexts = [] + data_arm_rewards = [] + tid = 0 + + for i in range(n_points): + confs = confidence_vals[i] + maxidx = np.argmax(confs) + + for arm in range(context_size): + data_tids.append(f'warm-start-tid-{tid}') + reward = +1.0 if arm == maxidx else -1.0 + + data_contexts.append(confs) + data_arm_rewards.append((arm, reward)) + + tid += 1 + + # randomise order; permute over all generated samples, not just n_points + idxorder = np.random.permutation(len(data_tids)) + + data_tids = [data_tids[i] for i in idxorder] + data_contexts = [data_contexts[i] for i in idxorder] + data_arm_rewards = [data_arm_rewards[i] for i in idxorder] + + return data_tids, np.array(data_contexts), data_arm_rewards + + +#pylint: disable=too-many-locals +def get_preferred_skill_warmstart_data(n_points, context_size, advantage_skillidx, disadvantage_skillidx): + """ generates warm start data to prefer one skill over another behavior """ + + confidence_vals = np.random.rand(n_points, context_size) + + data_tids = [] + data_contexts = [] + data_arm_rewards = [] + tid = 0 + + for i in range(n_points): + confs = confidence_vals[i] + confs_sorted_idx = np.argsort(confs) + + max_conf_idx = confs_sorted_idx[-1] + second_max_conf_idx =
confs_sorted_idx[-2] + + # Unless the disadvantaged skill has the max confidence and the advantaged + # skill has the second highest, follow the max orchestrator behavior + if max_conf_idx != disadvantage_skillidx and second_max_conf_idx != advantage_skillidx: + reward = +1.0 + data_tids.append(f'warm-start-tid-{tid}') + data_contexts.append(list(confs)) + data_arm_rewards.append((max_conf_idx, reward)) + tid += 1 + + # Make disadvantaged skill highest ranked, and preferred skill second highest + confs[disadvantage_skillidx], confs[max_conf_idx] = confs[max_conf_idx], confs[disadvantage_skillidx] + confs[advantage_skillidx], confs[second_max_conf_idx] = confs[second_max_conf_idx], confs[advantage_skillidx] + + # Negative reward for selecting disadvantaged skill if it has the + # max confidence and the advantaged skill has the second highest + data_tids.append(f'warm-start-tid-{tid}') + data_contexts.append(list(confs)) + data_arm_rewards.append((disadvantage_skillidx, -1.0)) + tid += 1 + + # Positive reward for selecting advantaged skill if it has the + # second highest confidence and the disadvantaged skill has the highest + data_tids.append(f'warm-start-tid-{tid}') + data_contexts.append(list(confs)) + data_arm_rewards.append((advantage_skillidx, +1.0)) + tid += 1 + + # randomise order; permute over all generated samples, not just n_points + idxorder = np.random.permutation(len(data_tids)) + + data_tids = [data_tids[i] for i in idxorder] + data_contexts = [data_contexts[i] for i in idxorder] + data_arm_rewards = [data_arm_rewards[i] for i in idxorder] + + return data_tids, np.array(data_contexts), data_arm_rewards + + +def get_warmstart_data(profile, **kwargs): + + if profile.lower() == 'noop-always': + result = get_noop_warmstart_data(**kwargs) + + elif profile.lower() == 'ignore-skill': + result = get_ignore_skill_warmstart_data(**kwargs) + + elif profile.lower() == 'max-orchestrator': + result = get_max_skill_warmstart_data(**kwargs) + + elif profile.lower() == 'preferred-skill': + result =
get_preferred_skill_warmstart_data(**kwargs) + + else: + raise ValueError(f'Unknown warm start profile: {profile}') + + return result diff --git a/clai/server/plugins/dataxplore/README.md b/clai/server/plugins/dataxplore/README.md new file mode 100644 index 00000000..381593cc --- /dev/null +++ b/clai/server/plugins/dataxplore/README.md @@ -0,0 +1,36 @@ +# dataXplore + +`Analytics` `NLP` `Support` + +Data science has become one of the most popular real-world applications of ML. This skill is targeted specifically +toward making the CLI easier to adopt and navigate for data scientists. + +## Implementation + +The current version of the skill provides two functionalities: **summarize** and **plot**. +"Summarize" utilizes the [describe function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.describe.html) of the popular +[Pandas library](https://pandas.pydata.org/pandas-docs/stable/index.html) to +generate a human-readable summary of a specified CSV file; this functionality is intended to allow data scientists +to quickly examine any data file right from the command line. "Plot" builds on the plot function provided by +[Matplotlib](https://ieeexplore.ieee.org/document/4160265), +and the Pillow library [[link](https://pillow.readthedocs.io/en/stable/index.html)] +[[link](https://www.pythonware.com/products/pil/)] +to generate a plot of a given CSV file. Such functionalities illustrate basic use cases +of how CLAI can be used as a CLI assistant for data science. + +## Example Usage + +`>> clai "dataxplore" summarize air_quality.csv` to view the summary of the given data file. + +`>> clai "dataxplore" plot air_quality.csv` to view a plot of the given data file. + +![figure1](https://www.dropbox.com/s/lin379uw2nc0ts9/dx_summarize_plot_test.png?raw=1) + +![figure2](https://www.dropbox.com/s/j4xxme9eaj92mh5/dx_summarize_plot_airQuality.png?raw=1) + +Both datasets are courtesy of [pandas](http://pandas.pydata.org/).
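The two verbs described above reduce to a few pandas and Matplotlib calls. A minimal sketch, with hypothetical helper names and placeholder paths:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Rough sketch of what the skill's two verbs boil down to; the function
# names and paths here are illustrative, not the skill's actual API.

def summarize(csv_path: str) -> str:
    """`clai "dataxplore" summarize file.csv` uses DataFrame.describe()."""
    return pd.read_csv(csv_path).describe().to_string()

def plot(csv_path: str, out_path: str = "/tmp/claifigure.png") -> str:
    """`clai "dataxplore" plot file.csv` renders the frame and saves a PNG."""
    data = pd.read_csv(csv_path, index_col=0, parse_dates=True)
    data.plot()
    plt.savefig(out_path)
    plt.close("all")
    return out_path
```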
+ +## [xkcd](https://uni.xkcd.com/) +The contents of any one panel are dependent on the contents of every panel including itself. The graph of panel dependencies is complete and bidirectional, and each node has a loop. The mouseover text has two hundred and forty-two characters. + +![alt text](https://imgs.xkcd.com/comics/self_description.png "The contents of any one panel are dependent on the contents of every panel including itself. The graph of panel dependencies is complete and bidirectional, and each node has a loop. The mouseover text has two hundred and forty-two characters.") diff --git a/clai/server/plugins/dataxplore/__init__.py b/clai/server/plugins/dataxplore/__init__.py new file mode 100644 index 00000000..f561d365 --- /dev/null +++ b/clai/server/plugins/dataxplore/__init__.py @@ -0,0 +1,6 @@ +# +# Copyright (C) 2020 IBM. All Rights Reserved. +# +# See LICENSE.txt file in the root directory +# of this source tree for licensing information. +# diff --git a/clai/server/plugins/dataxplore/dataxplore.py b/clai/server/plugins/dataxplore/dataxplore.py new file mode 100644 index 00000000..8f0dcffc --- /dev/null +++ b/clai/server/plugins/dataxplore/dataxplore.py @@ -0,0 +1,82 @@ +# +# Copyright (C) 2020 IBM. All Rights Reserved. +# +# See LICENSE.txt file in the root directory +# of this source tree for licensing information. 
+# + +from clai.server.agent import Agent +from clai.server.command_message import State, Action, NOOP_COMMAND +from clai.tools.colorize_console import Colorize + +from clai.server.logger import current_logger as logger +import pandas as pd +import os +import matplotlib.pyplot as plt +from PIL import Image + + +class DATAXPLORE(Agent): + def __init__(self): + super(DATAXPLORE, self).__init__() + # self.service = Service() + + def get_next_action(self, state: State) -> Action: + + # user typed in, in natural language + command = state.command + + try: + logger.info("Command passed in dataxplore: " + command) + commandStr = str(command) + commandTokenized = commandStr.split(" ") + if len(commandTokenized) == 2: + if commandTokenized[0] == 'summarize': + fileName = commandTokenized[1] + csvFile = fileName.split(".") + if len(csvFile) == 2: + if csvFile[1] == 'csv': + path = os.path.abspath(fileName) + data = pd.read_csv(path) + df = pd.DataFrame(data) + response = df.describe().to_string() + else: + response = "We currently support only csv files. Please, Try >> clai dataxplore summarize csvFileLocation " + else: + response = "Not a supported file format. Please, Try >> clai dataxplore summarize csvFileLocation " + elif commandTokenized[0] == 'plot': + fileName = commandTokenized[1] + csvFile = fileName.split(".") + if len(csvFile) == 2: + if csvFile[1] == 'csv': + plt.close('all') + path = os.path.abspath(fileName) + data = pd.read_csv(path,index_col=0, parse_dates=True) + data.plot() + plt.savefig('/tmp/claifigure.png') + im = Image.open('/tmp/claifigure.png') + im.show() + response = "Please, check the popup for the figure." + else: + response = "We currently support only csv files. Please, Try >> clai dataxplore plot csvFileLocation " + else: + response = "Not a supported file format. 
Please, Try >> clai dataxplore plot csvFileLocation " + else: + response = "Try >> clai dataxplore function fileLocation " + else: + response = "Some arguments are missing. Please, Try >> clai dataxplore function fileLocation " + + confidence = 0.0 + + return Action( + suggested_command=NOOP_COMMAND, + execute=True, + description=Colorize().info().append(response).to_console(), + confidence=confidence) + + except Exception as ex: + # always return an Action so the caller receives a consistent type + return Action( + suggested_command=NOOP_COMMAND, + execute=True, + description=Colorize().info().append("Method failed with status " + str(ex)).to_console(), + confidence=0.0) diff --git a/clai/server/plugins/dataxplore/install.sh b/clai/server/plugins/dataxplore/install.sh new file mode 100755 index 00000000..03aaf701 --- /dev/null +++ b/clai/server/plugins/dataxplore/install.sh @@ -0,0 +1,11 @@ +#!/usr/bin/env bash + +echo "===============================================================" +echo "" +echo " Phase 1: Installing necessary tools" +echo "" +echo "===============================================================" + +# Install Python3 dependencies +echo ">> Installing python dependencies" +pip3 install -r requirements.txt \ No newline at end of file diff --git a/clai/server/plugins/dataxplore/manifest.properties b/clai/server/plugins/dataxplore/manifest.properties new file mode 100644 index 00000000..cf98b206 --- /dev/null +++ b/clai/server/plugins/dataxplore/manifest.properties @@ -0,0 +1,4 @@ +[DEFAULT] +name=dataxplore +description=This skill summarizes and plots csv data files right from the command line.
+default=yes diff --git a/clai/server/plugins/dataxplore/requirements.txt b/clai/server/plugins/dataxplore/requirements.txt new file mode 100644 index 00000000..d54a7793 --- /dev/null +++ b/clai/server/plugins/dataxplore/requirements.txt @@ -0,0 +1,5 @@ +pandas==1.0.3 +numpy==1.17.2 +matplotlib==3.2.1 +Pillow==7.1.1 +imageloader==0.0.5 diff --git a/clai/server/plugins/gitbot/README.md b/clai/server/plugins/gitbot/README.md new file mode 100644 index 00000000..a91d718b --- /dev/null +++ b/clai/server/plugins/gitbot/README.md @@ -0,0 +1,39 @@ +# gitbot + +`NLP` `Support` `Automation` + +This skill lets you manage and organize your GitHub repository in natural language. +It also lets you use natural language commands to issue popular git commands. + +## Implementation + +The skill demonstrates hooks into two interesting design patterns: + ++ Similar to the [`nlc2cmd`](../nlc2cmd/) skill, it demonstrates natural language to +command patterns. However, in contrast to the nlc2cmd implementation, here we demonstrate +how to use a natural language classifier local to the machine -- using [RASA](https://rasa.com/) -- +instead of calling an external service like [Watson Assistant](https://www.ibm.com/cloud/watson-assistant/). + ++ This skill also demonstrates instances of workflow automation in the context of code +development by using the [GitHub Actions API](https://github.com/features/actions). + +Like the `nlc2cmd` and `ibmcloud` skills, this skill is merely illustrative +of how natural language and automation can be integrated into code management through GitHub. +Contributions are welcome to improve the accuracy of the natural language interpretation, +the breadth of the use cases covered for workflow automation, or new features! 
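The local-classifier pattern above reduces to posting the user's text to the RASA server and acting on the returned intent only when its confidence clears a threshold. A minimal sketch of that gating step; the payload below is a hypothetical example of RASA's parse-response shape, which in the live skill comes from `requests.post('http://localhost:<port>/model/parse', json={'text': command})`:

```python
# Sketch of intent gating on a RASA NLU parse response: act on the
# predicted intent only when its confidence clears a threshold.
from typing import Optional


def accepted_intent(parse_response: dict, threshold: float = 0.9) -> Optional[str]:
    """Return the intent name if its confidence clears the threshold, else None."""
    intent = parse_response.get("intent", {})
    if intent.get("confidence", 0.0) > threshold:
        return intent.get("name")
    return None


# Hypothetical parse response for "commit changes"
sample = {"intent": {"name": "commit", "confidence": 0.97}, "entities": []}
print(accepted_intent(sample))  # commit
```

The gitbot service uses the same idea with a threshold of 0.9 stored in its state dictionary.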
+ +Before trying it out: + +> Fill out `sample_config.json` and rename it to `config.json` + +> run `./run_rasa.sh 5556` + +## Example Usage + +![gitbot](https://www.dropbox.com/s/7snw9sg3ab15rvr/gitbot.png?raw=1) + +## [xkcd](https://uni.xkcd.com/) + +If that doesn't fix it, git.txt contains the phone number of a friend of mine who understands git. Just wait through a few minutes of 'It's really pretty simple, just think of branches as...' and eventually you'll learn the commands that will fix everything. + +![alt text](https://imgs.xkcd.com/comics/git_2x.png "If that doesn't fix it, git.txt contains the phone number of a friend of mine who understands git. Just wait through a few minutes of 'It's really pretty simple, just think of branches as...' and eventually you'll learn the commands that will fix everything.") diff --git a/clai/server/plugins/gitbot/__init__.py b/clai/server/plugins/gitbot/__init__.py new file mode 100644 index 00000000..f561d365 --- /dev/null +++ b/clai/server/plugins/gitbot/__init__.py @@ -0,0 +1,6 @@ +# +# Copyright (C) 2020 IBM. All Rights Reserved. +# +# See LICENSE.txt file in the root directory +# of this source tree for licensing information. +# diff --git a/clai/server/plugins/gitbot/gitbot.py b/clai/server/plugins/gitbot/gitbot.py new file mode 100644 index 00000000..d44a956b --- /dev/null +++ b/clai/server/plugins/gitbot/gitbot.py @@ -0,0 +1,46 @@ +# +# Copyright (C) 2020 IBM. All Rights Reserved. +# +# See LICENSE.txt file in the root directory +# of this source tree for licensing information. 
+# + +''' imports ''' +from clai.server.agent import Agent +from clai.server.command_message import State, Action, NOOP_COMMAND + +from clai.server.plugins.gitbot.service import Service, _rasa_port_number +from clai.tools.colorize_console import Colorize + +import os + +''' main gitbot class ''' +class GITBOT(Agent): + def __init__(self): + super(GITBOT, self).__init__() + self.service = Service() + + # KILL THE RASA SERVER + # WARNING KILLS OTHERS TOO + # os.system('lsof -t -i tcp:{} | xargs kill'.format(_rasa_port_number)) + # BRING UP THE RASA SERVER + # os.system('rasa run --enable-api -m {} -p {}'.format(_path_to_rasa_model, _rasa_port_number)) + + ''' pre execution processing ''' + def get_next_action(self, state: State) -> Action: + command = state.command + return self.service(command) + + ''' post execution processing ''' + def post_execute(self, state: State) -> Action: + # for a more sophisticated state change mechanism, see the ibmcloud skill + if state.result_code == '0': self.service.parse_command(state.command, stdout = "") + return Action(suggested_command=NOOP_COMMAND) + + def save_agent(self) -> bool: + # KILL THE RASA SERVER + os.system('lsof -t -i tcp:{} | xargs kill'.format(_rasa_port_number)) + + # return to original destruction method + return super().save_agent() + diff --git a/clai/server/plugins/gitbot/install.sh b/clai/server/plugins/gitbot/install.sh new file mode 100755 index 00000000..161778fe --- /dev/null +++ b/clai/server/plugins/gitbot/install.sh @@ -0,0 +1,12 @@ +#!/usr/bin/env bash + +echo "===============================================================" +echo "" +echo " Phase 1: Installing necessary tools" +echo "" +echo "===============================================================" + +# Install Python3 dependencies +echo ">> Installing python dependencies" +pip3 install -r requirements.txt + diff --git a/clai/server/plugins/gitbot/manifest.properties b/clai/server/plugins/gitbot/manifest.properties new file mode 100644 index 00000000..7e3e516c --- 
/dev/null +++ b/clai/server/plugins/gitbot/manifest.properties @@ -0,0 +1,4 @@ +[DEFAULT] +name=gitbot +description=This skill helps you manage your github repositories. +default=yes \ No newline at end of file diff --git a/clai/server/plugins/gitbot/rasa/__init__.py b/clai/server/plugins/gitbot/rasa/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/clai/server/plugins/gitbot/rasa/config.yml b/clai/server/plugins/gitbot/rasa/config.yml new file mode 100644 index 00000000..e22e49c7 --- /dev/null +++ b/clai/server/plugins/gitbot/rasa/config.yml @@ -0,0 +1,26 @@ +# Configuration for Rasa NLU. +# https://rasa.com/docs/rasa/nlu/components/ +language: en +pipeline: + - name: WhitespaceTokenizer + - name: RegexFeaturizer + - name: LexicalSyntacticFeaturizer + - name: CountVectorsFeaturizer + - name: CountVectorsFeaturizer + analyzer: "char_wb" + min_ngram: 1 + max_ngram: 4 + - name: DIETClassifier + epochs: 100 + - name: EntitySynonymMapper + - name: ResponseSelector + epochs: 100 + +# Configuration for Rasa Core. 
+# https://rasa.com/docs/rasa/core/policies/ +policies: + - name: MemoizationPolicy + - name: TEDPolicy + max_history: 5 + epochs: 100 + - name: MappingPolicy diff --git a/clai/server/plugins/gitbot/rasa/data/nlu.md b/clai/server/plugins/gitbot/rasa/data/nlu.md new file mode 100644 index 00000000..3a5d7c85 --- /dev/null +++ b/clai/server/plugins/gitbot/rasa/data/nlu.md @@ -0,0 +1,36 @@ +## intent:commit +- commit +- commit changes + +## intent:push +- push +- push changes +- push changes into [vv](target) +- push changes into [ww](target) +- push changes from [aaa](source) into [xxx](target) +- push changes from [bbb](source) into [yyy](target) +- push changes from [ccc](source) into [zzz](target) +- push changes from [master](source) into [develop](target) +- push changes from [develop](source) into [master](target) + +## intent:merge +- merge +- merge into [vv](target) +- merge into [ww](target) +- merge [aaa](source) into [xxx](target) +- merge [bbb](source) into [yyy](target) +- merge [ccc](source) into [zzz](target) +- merge [master](source) into [develop](target) +- merge [develop](source) into [master](target) + +## intent:comment +- add comment to issue [number](id) "[comment](comment)" +- add comment to PR [number](id) "[comment](comment)" +- add comment to issue [1](id) "[aaa](comment)" +- add comment to PR [1](id) "[xxx](comment)" +- add comment to issue [2](id) "[bbb](comment)" +- add comment to PR [2](id) "[yyy](comment)" +- add comment to issue [3](id) "[ccc](comment)" +- add comment to PR [3](id) "[zzz](comment)" +- add comment to issue [id](id) "[new comment](comment)" +- add comment to PR [id](id) "[new comment](comment)" diff --git a/clai/server/plugins/gitbot/rasa/models/rasa-saved-nlu-model.tar.gz b/clai/server/plugins/gitbot/rasa/models/rasa-saved-nlu-model.tar.gz new file mode 100644 index 00000000..b4f0519e Binary files /dev/null and b/clai/server/plugins/gitbot/rasa/models/rasa-saved-nlu-model.tar.gz differ diff --git 
a/clai/server/plugins/gitbot/requirements.txt b/clai/server/plugins/gitbot/requirements.txt new file mode 100644 index 00000000..57900691 --- /dev/null +++ b/clai/server/plugins/gitbot/requirements.txt @@ -0,0 +1,2 @@ +requests +rasa \ No newline at end of file diff --git a/clai/server/plugins/gitbot/run_rasa.sh b/clai/server/plugins/gitbot/run_rasa.sh new file mode 100755 index 00000000..88d7e198 --- /dev/null +++ b/clai/server/plugins/gitbot/run_rasa.sh @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +echo "Bringing up RASA server" + +lsof -t -i tcp:$1 | xargs kill +# resolve the saved model relative to this script rather than a hardcoded user directory +rasa run --enable-api -m "$(dirname "$0")/rasa/models/rasa-saved-nlu-model.tar.gz" -p $1 diff --git a/clai/server/plugins/gitbot/sample_config.json b/clai/server/plugins/gitbot/sample_config.json new file mode 100644 index 00000000..2e8c1fdd --- /dev/null +++ b/clai/server/plugins/gitbot/sample_config.json @@ -0,0 +1,8 @@ +{ + "github_personal_access_token" : "", + "github_username" : "", + "github_repo" : ["org-name or user-name", "repo-name"], + "path_to_log_file" : "absolute/path/to/log/file/with/write/permissions", + "path_to_rasa_saved_model" : "relative/path/to/rasa/models/rasa-saved-nlu-model.tar.gz", + "rasa_port_number" : "port-number" +} diff --git a/clai/server/plugins/gitbot/service.py b/clai/server/plugins/gitbot/service.py new file mode 100644 index 00000000..350781e5 --- /dev/null +++ b/clai/server/plugins/gitbot/service.py @@ -0,0 +1,194 @@ +# +# Copyright (C) 2020 IBM. All Rights Reserved. +# +# See LICENSE.txt file in the root directory +# of this source tree for licensing information.
+# + +''' imports ''' +from typing import List +from datetime import datetime +from clai.server.command_message import Action, NOOP_COMMAND +from clai.tools.colorize_console import Colorize + +import requests +import json +import os + +''' globals ''' +_real_path = '/'.join(os.path.realpath(__file__).split('/')[:-1]) +_path_to_config_file = _real_path + '/config.json' + +_config = json.loads( open(_path_to_config_file).read() ) + +_github_access_token = _config["github_personal_access_token"] +_github_username = _config["github_username"] +_github_repo = _config["github_repo"] +_github_url = "https://api.github.com/repos/{}".format("/".join(_github_repo)) + +_path_to_log_file =_config["path_to_log_file"] + +_rasa_port_number = _config["rasa_port_number"] +_rasa_service = 'http://localhost:{}/model/parse'.format(_rasa_port_number) + +class Service: + def __init__(self): + self.state = { + "ready_flag" : False, + "threshold" : 0.9, + "current_intent" : None, + "current_branch" : None, + "commit_details" : None, + "has_done_add" : False, + "has_done_commit" : False + } + + self.gh_session = requests.Session() + self.gh_session.auth = (_github_username, _github_access_token) + + + def __call__(self, *args, **kwargs): + + # Set user command + command = args[0] + confidence = 0.0 + + try: + + response = requests.post(_rasa_service, json={'text': command}).json() + confidence = response["intent"]["confidence"] + + if confidence > self.state["threshold"]: + self.state["current_intent"] = response["intent"]["name"] + + except Exception as ex: + print("Method failed with status " + str(ex)) + + if self.state["current_intent"] == "commit": + + if not self.state["ready_flag"]: + self.state["ready_flag"] = True + temp_description = "Ready to {}. 
Press execute to continue.".format(self.state["current_intent"]) + + return [ Action( + suggested_command="git status | tee {}".format(_path_to_log_file), + execute=True, + description=None, + confidence=1.0 + ), Action( + suggested_command=NOOP_COMMAND, + execute=True, + description=Colorize().info().append(temp_description).to_console(), + confidence=1.0 + ) ] + + if command == "execute": + + now = datetime.now() + commit_message = now.strftime("%d/%m/%Y %H:%M:%S") + + # this should really go into post-execute # + stdout = open(_path_to_log_file).read().split("\n") + commit_description = "" + current_branch = None + + for line in stdout: + + if line.startswith("On branch"): + current_branch = line.split()[-1] + + if line.strip().startswith("modified:"): + commit_description += line.strip() + " + " + + if commit_description: self.state["commit_details"] = commit_description + if current_branch: self.state["current_branch"] = current_branch + + commit_description = self.state["commit_details"] + + respond = Action( + suggested_command='git commit -m "{}" -m "{}";'.format(commit_message, commit_description), + execute=False, + description=None, + confidence=1.0 + ) + + if self.state["has_done_add"]: + return respond + + else: + return [ Action( + suggested_command='git add -A', + execute=True, + description="Adding untracked files...", + confidence=1.0 + ), respond] + + + if self.state["current_intent"] == "push": + return Action( + suggested_command='git push', + execute=False, + description=None, + confidence=confidence + ) + + if self.state["current_intent"] == "merge": + merge_url = "{}/pulls".format(_github_url) + + source = None + target = "master" + + for entity in response["entities"]: + if entity["entity"] == "source": source = entity["value"] + if entity["entity"] == "target": target = entity["value"] + + if not source: source = self.state["current_branch"] + + payload = { "title" : "Dummy PR from gitbot", + "body" : "Example from the `gitbot` screencast. 
This will not be merged.", + "head" : source, + "base" : target } + + ping_github = json.loads(self.gh_session.post(merge_url, json=payload).text) + return Action( + suggested_command=NOOP_COMMAND, + execute=True, + description="Success. Created PR {}".format(ping_github["number"]), + confidence=confidence + ) + + if self.state["current_intent"] == "comment": + + idx = 0 + comment = None + + for entity in response["entities"]: + if entity["entity"] == "id": idx = entity["value"] + if entity["entity"] == "comment": comment = entity["value"] + + idx = command.split('<')[-1].split('>')[0] + comment_url = "{}/issues/{}/comments".format(_github_url, idx) + + payload = { "body" : comment } + + ping_github = json.loads(self.gh_session.post(comment_url, json=payload).text) + return Action( + suggested_command=NOOP_COMMAND, + execute=True, + description="Success", + confidence=confidence + ) + + + def parse_command(self, command: str, stdout: str): + + if command.startswith( "git checkout" ): + self.state["current_branch"] = None + + if command.startswith( "git add" ): + self.state["has_done_add"] = True + + if command.startswith( "git commit" ): + self.state["has_done_add"] = False + self.state["has_done_commit"] = True + diff --git a/clai/server/plugins/nlc2cmd/service.py b/clai/server/plugins/nlc2cmd/service.py index 48a64292..c5522b34 100644 --- a/clai/server/plugins/nlc2cmd/service.py +++ b/clai/server/plugins/nlc2cmd/service.py @@ -5,13 +5,11 @@ # of this source tree for licensing information.
# -import time -import threading - # itemgetter is faster than lambda functions for sorting from operator import itemgetter import clai.server.plugins.nlc2cmd.wa_skills as wa_skills +import threading class Service: def __init__(self): diff --git a/clai/server/plugins/tellina/README.md b/clai/server/plugins/tellina/README.md new file mode 100644 index 00000000..e10b0540 --- /dev/null +++ b/clai/server/plugins/tellina/README.md @@ -0,0 +1,52 @@ +# tellina + +`NLP` `Support` + +This skill expands on the functionality of the [`nlc2cmd`](https://github.com/IBM/clai/tree/master/clai/server/plugins/nlc2cmd) skill by +providing a direct interface to the [`Tellina`](https://github.com/TellinaTool/) system. + +## Implementation + +The natural language input from the user at the command line is sent as a query to an +instance of the [`Tellina`](http://tellina.rocks/) system, which runs [`NL2Bash`](https://github.com/TellinaTool/nl2bash/) as its +translation engine. `NL2Bash` uses [`TensorFlow`](https://www.tensorflow.org/) to create a model of the top `Bash` [*utilities*](https://github.com/TellinaTool/nl2bash/tree/master/data/bash), and predicts a relevant utility for the user's natural language query. After that, it uses *Abstract Syntax Trees* (ASTs) to predict the correct options (flags) to use with that predicted utility. + +### Scoring & Confidence + +Currently, the `tellina` skill returns a default confidence of `0.0`. We are working on a confidence scoring function that retrieves the prediction probabilities from TensorFlow and converts them into a confidence score for use with `CLAI`. + +Use direct invocation to use `tellina` even with zero confidence. + +``` +>> clai "tellina" [command] +``` + +## Example Usage + +![clai-tellina-gif](https://www.dropbox.com/s/063tmajzchskvws/tellina.gif?raw=1) + +### Sample Invocations + +Try out the invocations below for a quick start! + +1. `>> clai show me all files` +2. `>> clai display files sorted by number of lines` +3. 
`>> clai tellina show me all running processes` +4. `>> clai print largest files` +5. `>> clai print space separated numbers from 1 to 5` +6. `>> clai recursively add ".jpg" to all files in the current directory tree` +7. `>> clai number of lines in "foo.txt"` +8. `>> clai exit terminal` + +The [NL2Bash data](https://github.com/TellinaTool/nl2bash/blob/master/data/bash/all.nl) features an extensive list of natural language invocations that can be used with the Tellina system. + +## :star: NeurIPS 2020 NLC2CMD Competition + +We are excited to announce the [`NeurIPS 2020 NLC2CMD competition`](http://ibm.biz/nlc2cmd), which builds on Tellina and the NL2Bash data. To sign-up for the competition, click [here](http://nlc2cmd.us-east.mybluemix.net/#/participate)! + +## [xkcd](https://uni.xkcd.com/) + + +The pile gets soaked with data and starts to get mushy over time, so it's technically recurrent. + +![alt text](https://imgs.xkcd.com/comics/machine_learning.png "The pile gets soaked with data and starts to get mushy over time, so it's technically recurrent.") diff --git a/clai/server/plugins/tellina/__init__.py b/clai/server/plugins/tellina/__init__.py new file mode 100644 index 00000000..f561d365 --- /dev/null +++ b/clai/server/plugins/tellina/__init__.py @@ -0,0 +1,6 @@ +# +# Copyright (C) 2020 IBM. All Rights Reserved. +# +# See LICENSE.txt file in the root directory +# of this source tree for licensing information. 
+# diff --git a/clai/server/plugins/tellina/install.sh b/clai/server/plugins/tellina/install.sh new file mode 100755 index 00000000..03aaf701 --- /dev/null +++ b/clai/server/plugins/tellina/install.sh @@ -0,0 +1,11 @@ +#!/usr/bin/env bash + +echo "===============================================================" +echo "" +echo " Phase 1: Installing necessary tools" +echo "" +echo "===============================================================" + +# Install Python3 dependencies +echo ">> Installing python dependencies" +pip3 install -r requirements.txt \ No newline at end of file diff --git a/clai/server/plugins/tellina/manifest.properties b/clai/server/plugins/tellina/manifest.properties new file mode 100644 index 00000000..82eeea64 --- /dev/null +++ b/clai/server/plugins/tellina/manifest.properties @@ -0,0 +1,4 @@ +[DEFAULT] +name=tellina +description=This skill translates your natural language command into a Bash command. +default=yes \ No newline at end of file diff --git a/clai/server/plugins/tellina/requirements.txt b/clai/server/plugins/tellina/requirements.txt new file mode 100644 index 00000000..663bd1f6 --- /dev/null +++ b/clai/server/plugins/tellina/requirements.txt @@ -0,0 +1 @@ +requests \ No newline at end of file diff --git a/clai/server/plugins/tellina/tellina.py b/clai/server/plugins/tellina/tellina.py new file mode 100644 index 00000000..dc528215 --- /dev/null +++ b/clai/server/plugins/tellina/tellina.py @@ -0,0 +1,47 @@ +# +# Copyright (C) 2020 IBM. All Rights Reserved. +# +# See LICENSE.txt file in the root directory +# of this source tree for licensing information. 
+# + +from clai.server.agent import Agent +from clai.server.command_message import State, Action, NOOP_COMMAND +from clai.tools.colorize_console import Colorize + +from clai.server.logger import current_logger as logger + +import requests + +''' globals ''' +tellina_endpoint = 'http://nlc2cmd.sl.res.ibm.com:8000/api/translate' + + +class TELLINA(Agent): + def __init__(self): + super(TELLINA, self).__init__() + + def get_next_action(self, state: State) -> Action: + + # user typed in, in natural language + command = state.command + + try: + + ## Needs to be a post request since service/endpoint is configured for post + endpoint_comeback = requests.post(tellina_endpoint, json={'command': command}).json() + ## tellina endpoint must return a json with 'response' and 'confidence' fields + + # tellina response, the cmd for the user NL utterance + response = 'Try >> ' + endpoint_comeback['response'] + # the confidence; the tellina endpoint currently returns 0.0 + confidence = float(endpoint_comeback['confidence']) + + return Action( + suggested_command=NOOP_COMMAND, + execute=True, + description=Colorize().info().append(response).to_console(), + confidence=confidence) + + except Exception as ex: + # always return an Action so the caller receives a consistent type + return Action( + suggested_command=NOOP_COMMAND, + execute=True, + description=Colorize().info().append("Method failed with status " + str(ex)).to_console(), + confidence=0.0) diff --git a/clai/tools/docker_utils.py b/clai/tools/docker_utils.py index 18a5be21..c6a09263 100644 --- a/clai/tools/docker_utils.py +++ b/clai/tools/docker_utils.py @@ -56,6 +56,4 @@ def execute_cmd(container, command): sleep(1) socket.output._sock.send(b"exit\n") - - print(f'the output is: {data}') return str(data) diff --git a/configPlugins.json b/configPlugins.json index 3d6d30a6..49f25b49 100644 --- a/configPlugins.json +++ b/configPlugins.json @@ -1 +1 @@ -{"selected": {"user": ["nlc2cmd"]}, "default": ["nlc2cmd"], "default_orchestrator": "max_orchestrator", "installed": [], "report_enable": false} \ No newline at end of file +{"selected": {"user": ["nlc2cmd"]}, "default": ["tellina"], "default_orchestrator": "max_orchestrator", 
"installed": [], "report_enable": false} diff --git a/docs/FAQ.md b/docs/FAQ.md index 20458038..9de8de3b 100644 --- a/docs/FAQ.md +++ b/docs/FAQ.md @@ -6,7 +6,7 @@ Welcome to the world of command line AI. You can contribute in any form, inlcudi ## What data is being saved? -If you agree to contribute your data during the installation process (optional) we will store 1) statistics on which skills you use; and 2) the `State` and `Action` information outlined [here](../clai/server/plugins/). This will help us improve over time. The data from (2) can be used to train learning agents such as the one outlined [here](../clai/server/orchestration/patterns/bandit_orchestrator/) and may be made available in anonymized form for the community to use as training data. Individual skills do no store any data, unless otherwise mentioned explicitly while they are being installed. +If you agree to contribute your data during the installation process (optional) we will store 1) statistics on which skills you use; and 2) the `State` and `Action` information outlined [here](../clai/server/plugins/). This will help us improve over time. The data from (2) can be used to train learning agents such as the one outlined [here](../clai/server/orchestration/patterns/rltk_bandit_orchestrator/) and may be made available in anonymized form for the community to use as training data. Individual skills do not store any data, unless otherwise mentioned explicitly while they are being installed. ### Do passwords get sent to the skills? 
diff --git a/test_integration/contract_skills.py b/test_integration/contract_skills.py index a42d4bea..e3fda1d3 100644 --- a/test_integration/contract_skills.py +++ b/test_integration/contract_skills.py @@ -33,9 +33,11 @@ def test_install(self, my_clai_module): command_executed = execute_cmd(my_clai_module, command_select) - assert f"☑\x1b[32m {skill_name} (Installed)" in command_executed + assert f"☑\x1b[32m {skill_name} (Installed)" in command_executed, \ + f'Skill {skill_name} not found installed. Output: {command_executed}' @pytest.mark.dependency(depends=['test_install']) def test_skill_values(self, my_clai_module, command, command_expected): command_executed = execute_cmd(my_clai_module, command) - assert command_expected in command_executed + assert command_expected in command_executed, \ + f'Expected: {command_expected}, Received: {command_executed}' diff --git a/test_integration/test_skill_tellina.py b/test_integration/test_skill_tellina.py new file mode 100644 index 00000000..be4f0856 --- /dev/null +++ b/test_integration/test_skill_tellina.py @@ -0,0 +1,24 @@ +# +# Copyright (C) 2020 IBM. All Rights Reserved. +# +# See LICENSE.txt file in the root directory +# of this source tree for licensing information. +# + +from test_integration.contract_skills import ContractSkills + + +class TestSkillTellina(ContractSkills): + + def get_skill_name(self): + return 'tellina' + + def get_commands_to_execute(self): + return ['pwd', + 'clai "tellina" exit terminal', + 'clai "tellina" show me all files'] + + def get_commands_expected(self): + return ['/opt/IBM/clai', + 'exit', + 'find .']
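As a footnote to the tellina integration tested above: the skill consumes a translate endpoint whose JSON reply carries `response` (the predicted Bash command) and `confidence` fields. A hedged sketch of how `tellina.py` turns that payload into a suggestion; the sample payload below is illustrative, shaped like the integration test's expected `find .` translation:

```python
# Sketch of tellina.py's response handling: build the display string and
# confidence from the endpoint's JSON reply. In the live skill the payload
# comes from requests.post(tellina_endpoint, json={'command': command}).
def format_suggestion(endpoint_json: dict) -> tuple:
    """Mirror the skill's handling of the translate endpoint's reply."""
    response = 'Try >> ' + endpoint_json['response']
    # the tellina endpoint currently reports a flat 0.0 confidence
    confidence = float(endpoint_json['confidence'])
    return response, confidence


# Illustrative payload for the query "show me all files"
text, confidence = format_suggestion({'response': 'find .', 'confidence': '0.0'})
print(text)  # Try >> find .
```

Because the suggestion text is printed rather than executed (`suggested_command=NOOP_COMMAND`), the integration test only needs to match the translated command in the skill's output.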