Brainstorming concept: DataTonic, help find the best-suited LLM for a user's use case #3

Closed
Zochory opened this issue Dec 16, 2023 · 3 comments

@Zochory
Contributor

Zochory commented Dec 16, 2023

Problem :

Today there is a large choice of custom solutions, but they are sometimes inaccessible, complex, or time-consuming. Users end up paying a subscription to "some overpriced bullshit AI solution", paying a fair subscription price, or paying a freelancer or agency a lot to do it for them.

Concept :

Datatonic uses autogen, truegen, its own prompt engineering, its own agents...

It can run evaluations and tests in order to find the right LLM for a given scenario.
It takes multimodal input (sound, image, text) into account using Gemini and Truegen.

Benefits for end users/companies :

  • Find a cost-efficient "bundle" for anyone, with or without prior knowledge
  • Transparent, with personalization and flexibility
  • Always up to date, so always useful (not a temporary concept that will lose relevance)
  • Gives clear documentation for the parts that Datatonic can't do for security reasons
  • No lock-in: the user is free to own their bundles, whether prebuilt or just documented

Benefits for us :

  • Will probably always be useful, and gives us additional data/insight over time
  • Business model with multiple income (pay as you go, agency service) (?)
  • Probably more

Limitation:
Seems complex, maybe too unrealistic?

Scenario example

User :

↳ Add a credential

  • Pay as you go
  • Bring your own key

↳ Explains what they want to accomplish

  • e.g.: I love monkeys, and I want a model able to ingest PDF books about monkeys that can contain images of monkeys, YouTube videos (including their sound), or photos.

System (still needs more information, so it keeps asking follow-up questions until it has everything it needs):

↳ Budget per token max

  • A dollar amount, converted into a number of tokens (maybe the system could make an a priori estimate of the number of tokens that would be consumed per month/week?)
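The a priori estimate could look roughly like the sketch below. It assumes a flat price per million tokens and a simple "requests per day × tokens per request" usage model; the helper names and all the numbers are made-up placeholders, not real provider pricing.

```python
def tokens_for_budget(budget_usd: float, usd_per_million_tokens: float) -> int:
    """Number of tokens a dollar budget buys at a flat per-million-token price."""
    if usd_per_million_tokens <= 0:
        raise ValueError("price must be positive")
    return int(budget_usd / usd_per_million_tokens * 1_000_000)


def estimate_tokens(requests_per_day: int, avg_tokens_per_request: int, days: int = 30) -> int:
    """Rough guess of the tokens consumed over a period (default: one month)."""
    return requests_per_day * avg_tokens_per_request * days


# Placeholder inputs: a $20 monthly budget at a hypothetical $2 per million tokens.
budget_tokens = tokens_for_budget(budget_usd=20.0, usd_per_million_tokens=2.0)
needed_tokens = estimate_tokens(requests_per_day=50, avg_tokens_per_request=1500)
within_budget = needed_tokens <= budget_tokens

print(f"budget buys {budget_tokens:,} tokens, estimated need {needed_tokens:,}")
print("within budget" if within_budget else "over budget")
```

A real version would probably need separate input and output token prices, since many providers bill them differently.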

↳ Solutions currently in use, to add as integrations :

  • Integration CTA
  • Postgres credential

↳ Specific requirements, e.g. :

  • "Must use Cohere"
  • "OSS LLM"...
  • "Intensive, frequent ingestion, or only from time to time?"
  • "Chat history? Conversation? Persistent memory?"

↳ When the user is done and the system has all the information it needs, it opens a draggable list with each component the user provided

  • Budget
  • Must use Cohere
  • OSS LLM
  • ...

↳ Enterprise extra security needed ?

↳ ...

↳ ...

↳ The user can re-order the list by importance
↳ Or use another kind of measure, such as a Likert-style scale (trivial, not important, important, very important)
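One possible way to turn either input into numeric weights is sketched below. Rank-sum weighting and the 0 to 3 mapping of the Likert labels are assumptions chosen for illustration, and the function names are hypothetical.

```python
def weights_from_order(ordered_requirements: list[str]) -> dict[str, float]:
    """Rank-sum weights: the first item gets the largest share, weights sum to 1."""
    n = len(ordered_requirements)
    total = n * (n + 1) / 2
    return {name: (n - i) / total for i, name in enumerate(ordered_requirements)}


# Assumed numeric mapping for the labels from the issue text.
LIKERT_SCORES = {"trivial": 0, "not important": 1, "important": 2, "very important": 3}


def weights_from_likert(ratings: dict[str, str]) -> dict[str, float]:
    """Normalize Likert ratings into weights that sum to 1."""
    raw = {name: LIKERT_SCORES[label] for name, label in ratings.items()}
    total = sum(raw.values())
    if total == 0:
        # Everything rated "trivial": fall back to equal weights.
        return {name: 1 / len(raw) for name in raw}
    return {name: score / total for name, score in raw.items()}


order_weights = weights_from_order(["Budget", "Must use Cohere", "OSS LLM"])
likert_weights = weights_from_likert({"Budget": "very important", "OSS LLM": "not important"})
print(order_weights)
print(likert_weights)
```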

↳ Datatonic draws on its datasets (these can be benchmarks of Transformers, embedding/chunking...), on existing evaluations based on the proper metric, and on its own evaluations, already made and frequently updated, across language models and multimodal models (which would be more complex (?))

↳ Provides a few possible bundles (without having to create each one (possible?))
↳ The user chooses one of these, or asks for adjustments through chat input
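A minimal sketch of how a few bundles could be proposed without building them: filter candidates on hard constraints (budget, required provider), then rank the rest by a weighted score. The `Bundle` fields, the candidate list, and every score below are hypothetical placeholders, not real evaluation results.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Bundle:
    name: str
    provider: str
    usd_per_million_tokens: float
    scores: dict[str, float] = field(default_factory=dict)  # criterion -> score in [0, 1]


def rank_bundles(
    bundles: list[Bundle],
    weights: dict[str, float],
    max_usd_per_million_tokens: float,
    required_provider: Optional[str] = None,
    top_k: int = 3,
) -> list[Bundle]:
    """Drop bundles that break a hard constraint, then sort by weighted score."""
    eligible = [
        b for b in bundles
        if b.usd_per_million_tokens <= max_usd_per_million_tokens
        and (required_provider is None or b.provider == required_provider)
    ]

    def weighted_score(bundle: Bundle) -> float:
        return sum(w * bundle.scores.get(criterion, 0.0) for criterion, w in weights.items())

    return sorted(eligible, key=weighted_score, reverse=True)[:top_k]


# Placeholder candidates; scores would come from Datatonic's evaluations.
candidates = [
    Bundle("bundle-a", "cohere", 1.0, {"quality": 0.9, "cost": 0.3}),
    Bundle("bundle-b", "cohere", 0.5, {"quality": 0.6, "cost": 0.9}),
    Bundle("bundle-c", "other", 0.1, {"quality": 0.95, "cost": 0.95}),
    Bundle("bundle-d", "cohere", 10.0, {"quality": 1.0, "cost": 0.0}),
]
proposed = rank_bundles(
    candidates,
    weights={"quality": 0.7, "cost": 0.3},
    max_usd_per_million_tokens=2.0,
    required_provider="cohere",
)
print([b.name for b in proposed])
```

Treating "Must use Cohere" as a hard filter rather than a weighted preference is a design choice; a softer version could score it like any other criterion.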

Final step 🕺🏿 ↳ When the user has chosen a bundle, Datatonic starts the work and the user will be notified when it's done

### The final step seems like too much: it adds too much complexity and seems a bit nonsensical, when we could provide detailed documentation instead and give the user/company the option of having us build it for them as an agency, with support and onboarding

@Josephrp
Member

just one point first: the user intention reads less like an objective and more like a technique, which is not exactly what i would expect in the user section. this is too complex to put to work right now, so let's do an MVP with a single configuration that at least sometimes works, then it will be easier to provide alternatives. in fact we can start evaluating alternatives in the context of the TruEra evaluation.

@Josephrp
Member

Data Driven Advisory (Use Case)

Phase 1: Engagement Setup (1-2 weeks)

Client Background Information: Company history, mission, vision, and strategic objectives.
Industry Data: Market size, trends, competitors, and regulatory environment.
Stakeholder Information: Key stakeholders, organizational structure, and decision-makers.

Phase 2: Data Gathering and Analysis (3-6 weeks)

Operational Data: Sales figures, production data, supply chain details, employee information.
Financial Data: Profit and loss statements, balance sheets, cash flow statements, budgets.
Customer Data: Customer demographics, satisfaction surveys, purchase history.
Internal Documents: Previous strategy documents, reports, internal analyses.

Phase 3: In-Depth Analysis and Hypothesis Testing (4-8 weeks)

Segmented Data: More detailed operational and financial data broken down by business unit, geography, product line, etc.
Competitive Intelligence: Detailed competitor analysis, market share, business models.
Benchmarking Data: Industry benchmarks, best practices, case studies.
Qualitative Data: Interviews, focus groups, expert opinions.

Phase 4: Solution Development and Validation (2-4 weeks)

Scenario Analysis Data: For testing different strategic options and their potential outcomes.
Risk Assessment Data: Data related to potential risks and mitigation strategies.
Feedback Data: Initial feedback on proposed solutions from a small group of stakeholders or pilot tests.

Phase 5: Final Recommendations and Implementation Planning (2-3 weeks)

Consolidated Analysis: Summarized data and analysis that support the final recommendations.
Stakeholder Feedback: Comprehensive feedback on proposed recommendations.
Implementation Data: Resources required for implementation, timelines, and milestones.

Phase 6: Implementation Support and Closure (Variable)

Performance Data: Metrics and KPIs to track the implementation progress.
Adjustment Data: Ongoing data collection for adjusting strategies as needed.
Final Outcome Data: Data reflecting the impact of the implemented solutions.

Post-Engagement (Optional)

Long-term Impact Data: Data collected over time to assess the long-term impact of the engagement.
Follow-up Feedback: Stakeholder feedback on the effectiveness and outcomes of the project.
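Adding up the phase durations listed above gives a rough range for the fixed-length part of an engagement. A small sketch (Phase 6 and Post-Engagement are left out because no fixed duration is given for them; `PHASE_WEEKS` is just an illustrative name):

```python
# (min_weeks, max_weeks) per phase, copied from the phase headings above.
PHASE_WEEKS = {
    "Engagement Setup": (1, 2),
    "Data Gathering and Analysis": (3, 6),
    "In-Depth Analysis and Hypothesis Testing": (4, 8),
    "Solution Development and Validation": (2, 4),
    "Final Recommendations and Implementation Planning": (2, 3),
}

min_weeks = sum(low for low, _ in PHASE_WEEKS.values())
max_weeks = sum(high for _, high in PHASE_WEEKS.values())

print(f"Phases 1-5 take between {min_weeks} and {max_weeks} weeks")
# prints "Phases 1-5 take between 12 and 23 weeks"
```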

Statement Of Work:

(example of a single fixed output useable by autogen)

Overview: Brief description of the client's organization and the context of the engagement.
Purpose of the SoW: Clarification of the document's intent and its role as a guiding agreement.

Project Objectives and Scope

Objectives: Clear and specific goals the project aims to achieve.
Scope of Work: Detailed description of the services and tasks to be performed. This section delineates what is included and, just as importantly, what is not included in the engagement.

Project Approach and Methodology

Methodology: Explanation of the methodologies, frameworks, or strategies the consulting team will employ.
Phases of Work: Breakdown of the project into phases or milestones, each with specific tasks and objectives.

Deliverables

List of Deliverables: Detailed list of expected outputs, reports, presentations, tools, or models to be provided.
Quality Standards: Description of the standards or criteria against which the deliverables will be assessed. 

Timeline

Project Timeline: Detailed timeline of the project, including start and end dates, phase durations, and key milestones.
Review Points: Scheduled points for reviewing progress and adjusting plans as necessary.

Roles and Responsibilities

Consulting Team Composition: Names and roles of the consultants involved.
Client Responsibilities: Specific tasks or inputs required from the client, such as data provision, key personnel involvement, etc.

Pricing and Payment Terms

Fee Structure: Details on how the consulting fees are structured - fixed fee, time and materials, etc.
Payment Schedule: Timeline and conditions for payments.

Confidentiality, Legal, and Ethical Considerations

Confidentiality Clauses: Terms ensuring the confidentiality of shared information.
Legal and Compliance Aspects: Adherence to relevant laws and industry regulations.
Ethical Standards: Commitment to maintaining high ethical standards during the engagement.

Terms and Conditions

Contractual Terms: General terms including contract duration, termination conditions, dispute resolution mechanisms, etc.
Amendment Process: Process for making changes to the SoW.

Signatures

Sign-off: Signatures from authorized representatives of both the consulting firm and the client.
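To make the SoW a "single fixed output", one option is to pin its sections down as a typed structure that serializes to JSON. The sketch below is only an assumption about how that could look: the field names mirror the section titles above, it does not call any autogen API, and `missing_sections` is a hypothetical helper for checking what an agent still has to fill in.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class StatementOfWork:
    overview: str = ""
    purpose: str = ""
    objectives: list[str] = field(default_factory=list)
    scope_of_work: str = ""
    methodology: str = ""
    phases_of_work: list[str] = field(default_factory=list)
    deliverables: list[str] = field(default_factory=list)
    quality_standards: str = ""
    project_timeline: str = ""
    review_points: list[str] = field(default_factory=list)
    roles_and_responsibilities: dict[str, str] = field(default_factory=dict)
    fee_structure: str = ""
    payment_schedule: str = ""
    confidentiality_legal_ethical: str = ""
    terms_and_conditions: str = ""
    signatures: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Names of the sections that are still empty."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self) -> str:
        """Serialize the SoW so it can be passed around as one fixed output."""
        return json.dumps(asdict(self), indent=2)


# Placeholder content for illustration only.
sow = StatementOfWork(
    overview="Client background and context of the engagement.",
    objectives=["Identify the best-suited LLM configuration"],
)
print(sow.missing_sections())
print(sow.to_json())
```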

@Josephrp
Member

ok since everyone is aligned, i made a new issue, so we can try it like that now.
