# Classification with an LLM
- Classify papers into 10 categories using the OpenAI GPT model (`gpt-4.1-mini`)
- The 10 categories:
  - 1: AI- and Machine Learning Applications in IS  
  - 2: IS Security, Privacy, and Technology Adoption  
  - 3: Social Media and Online Community Behavior  
  - 4: Digital Transformation and IT Strategy in Organizations  
  - 5: Data Analytics and IS Performance  
  - 6: HCI, UX, and Interface Design in IS  
  - 7: ICT for Digital Inclusion and Social Equity  
  - 8: Platform Business Models and Digital Markets  
  - 9: Green IS, CSR, and Sustainability  
  - 10: IS Theories, Methods, and Meta-Research
  
> Libraries used: pandas, LangChain

In [34]:
import pandas as pd
import dotenv
import os
from langchain_core.prompts import PromptTemplate
from IPython.display import display_markdown

# CONFIG
dotenv.load_dotenv()
os.environ["LANGSMITH_PROJECT"] = 'JAIS Paper Theme Classification Using LLM'

True

In [35]:
# Read the CSV file
papers_df = pd.read_csv('JAIS_papers.csv', encoding='utf-8-sig')
papers_df.head()

Unnamed: 0,title,abstract,citation,url,year,volume,issue
0,Explaining Persistent Ineffectiveness in Profe...,Abstract\nOnline communities (OCs) have become...,"Recommended Citation\n\n\n Stein, Mari-Klar...",https://aisel.aisnet.org/jais//vol23/iss1/1/,2022,23,1
1,A Design Theory for Energy and Carbon Manageme...,Abstract\nEnergy and carbon management systems...,"Recommended Citation\n\n\n Zampou, Eleni; M...",https://aisel.aisnet.org/jais//vol23/iss1/2/,2022,23,1
2,Examining the Impacts of Airbnb Review Policy ...,"Abstract\nIn July 2014, Airbnb, one of the big...","Recommended Citation\n\n\n Mousavi, Reza an...",https://aisel.aisnet.org/jais//vol23/iss1/3/,2022,23,1
3,Inventing Together: The Role of Actor Goals an...,Abstract\nWith the ubiquity of the internet an...,"Recommended Citation\n\n\n Abhari, Kaveh; D...",https://aisel.aisnet.org/jais//vol23/iss1/4/,2022,23,1
4,The Design of a System for Online Psychosocial...,Abstract\nThe design of sensitive online healt...,"Recommended Citation\n\n\n Sjöström, Jonas;...",https://aisel.aisnet.org/jais//vol23/iss1/5/,2022,23,1
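
The preview above shows every `abstract` value starting with a scraped `Abstract\n` label, which then ends up inside the prompt. A minimal sketch of stripping it before classification follows; the helper name `clean_abstract` is hypothetical and not part of the original notebook, and it assumes the label always sits on its own first line.

```python
import re


def clean_abstract(text: str) -> str:
    """Drop a leading 'Abstract' label line and collapse runs of whitespace."""
    # Remove the scraped heading only when it starts the string and is
    # followed by a newline, so sentences beginning with "Abstract" survive.
    without_label = re.sub(r"^\s*Abstract\s*\n", "", text, count=1)
    # Collapse newlines and repeated spaces left over from scraping.
    return re.sub(r"\s+", " ", without_label).strip()


print(clean_abstract("Abstract\nOnline communities (OCs) have become..."))
```

It could be applied to the whole column with `papers_df["abstract"] = papers_df["abstract"].map(clean_abstract)`.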


## Prompt Template

In [50]:
system_prompt = """
You are a classification assistant trained to identify the main research theme of academic papers published in the *Journal of the Association for Information Systems (JAIS)*.

You will be given the **title** and **abstract** of an IS (Information Systems) research paper.

Your task is to **assign exactly one of the following ten categories** that best represents the paper’s core theme. Each category is defined by a **specific scope of research problems, methods, or technological domain**.

## Category List

1. **AI- and Machine Learning-Driven Information System Design and Organizational Applications**  
   Research on how AI/ML technologies such as predictive models, algorithmic decision-making, or Generative AI are designed, deployed, or governed in IS contexts (e.g., business, healthcare, labor).

2. **IS Research on Information Security Behavior, Privacy Concerns, and Technology Adoption**  
   Covers behavioral theories (e.g., PMT, TPB), user responses to security policies, adoption of privacy-protecting technologies, and the paradox between personalization and privacy.

3. **User Behavior, Information Diffusion, and Community Dynamics in Social Media Platforms**  
   Studies related to user interactions, content sharing, social influence, misinformation, or platform governance (e.g., trolling, moderation, Q&A communities).

4. **Organizational-Level IS Strategy Research: Digital Transformation, IT Ambidexterity, and CIO Governance**  
   Research on digital innovation strategy, IT resource alignment, dynamic capabilities, and executive-level IT management.

5. **Data Analytics-Driven IS Performance and Value Creation for Organizations and Society**  
   Focus on the impact of analytics, big data, or data science on IS success, organizational performance, or public value, including systematic/meta analyses.

6. **Human–Computer Interaction, Interface Design, and User Experience (UX) in IS Research**  
   Includes studies on UI/UX design, cognitive aspects of interface use, and evaluation of digital agents, dashboards, or immersive systems (e.g., VR, fMRI studies).

7. **ICT Design for Digital Inclusion, Social Equity, and Access Among Marginalized Populations**  
   Explores how IS artifacts and ICTs support underserved users, promote fairness, or address digital divides (e.g., in fintech, education, governance).

8. **Digital Platform Business Models, User Engagement Mechanisms, and Market Transaction Design**  
   Focused on platform economics, pricing strategy, user-generated content, crowdfunding, and dual-sided market structure in digital platforms.

9. **Green IS, Corporate Social Responsibility, and Sustainable Information System Practices**  
   IS for sustainability, including carbon tracking systems, CSR impact, and design science contributions toward environmental or social responsibility.

10. **Theory Building, Methodological Innovation, and Meta-Level Scholarship in IS**  
    Includes literature reviews, methodological guidelines, theory synthesis, or frameworks for meta-theoretical advancement in the IS field.

## Output Format

Return a **JSON object** in this exact format:  
`{"category_number": X, "rationale": "..."}`

Where:
- **category_number**: (integer) the number of the selected category, from 1 to 10.
- **rationale**: (string) a brief classification rationale, **no more than two sentences**, explaining why this category was chosen.

## Input Example

**Title:** Algorithm Sensemaking: How Platform Workers Make Sense of Algorithmic Management  
**Abstract:** This paper investigates how gig workers interpret algorithmic management by applying the sensemaking lens. Through a qualitative study of ride-hailing platforms, we examine how digital control mechanisms shape worker perceptions and strategic responses.

**Output:**  
```json
{
  "category_number": 3,
  "rationale": "The study applies sensemaking theory to platform workers’ interpretations of algorithmic management, examining how control mechanisms influence perceptions and actions within a digital platform. This aligns most closely with social media and online community dynamics."
}
```

## Classification Rule

- Assign **only one** category.
- Choose the category that reflects the **primary theoretical or empirical contribution** of the study.
- If multiple categories are relevant, select the one with the **closest alignment to the core research question**.
- If the input does **not clearly fit any category**, set `category_number` to `0` and explain why in `rationale`.

Now classify the following input:

"""

input_prompt = """
# [System]
{system_prompt}

# [Input]
### **Title:** {title}
### **Abstract:** {abstract}
"""

prompt_template = PromptTemplate(
   template=input_prompt,
   input_variables=["system_prompt", "title", "abstract"]
)

prompt_template = prompt_template.partial(system_prompt=system_prompt)

In [37]:
# Prompt template test
# Render the prompt with the first paper's title and abstract
# Roughly 1,100 tokens per paper

prompt = prompt_template.invoke(
    input={
        "title": papers_df.iloc[0]['title'],
        "abstract": papers_df.iloc[0]['abstract']
    }
)

display_markdown(prompt.text, raw=True)


# [System]

You are a classification assistant trained to identify the main research theme of academic papers published in the *Journal of the Association for Information Systems (JAIS)*.

You will be given the **title** and **abstract** of an IS (Information Systems) research paper.

Your task is to **assign exactly one of the following ten categories** that best represents the paper’s core theme. Each category is defined by a **specific scope of research problems, methods, or technological domain**.

## Category List

1. **AI- and Machine Learning-Driven Information System Design and Organizational Applications**  
   Research on how AI/ML technologies such as predictive models, algorithmic decision-making, or Generative AI are designed, deployed, or governed in IS contexts (e.g., business, healthcare, labor).

2. **IS Research on Information Security Behavior, Privacy Concerns, and Technology Adoption**  
   Covers behavioral theories (e.g., PMT, TPB), user responses to security policies, adoption of privacy-protecting technologies, and the paradox between personalization and privacy.

3. **User Behavior, Information Diffusion, and Community Dynamics in Social Media Platforms**  
   Studies related to user interactions, content sharing, social influence, misinformation, or platform governance (e.g., trolling, moderation, Q&A communities).

4. **Organizational-Level IS Strategy Research: Digital Transformation, IT Ambidexterity, and CIO Governance**  
   Research on digital innovation strategy, IT resource alignment, dynamic capabilities, and executive-level IT management.

5. **Data Analytics-Driven IS Performance and Value Creation for Organizations and Society**  
   Focus on the impact of analytics, big data, or data science on IS success, organizational performance, or public value, including systematic/meta analyses.

6. **Human–Computer Interaction, Interface Design, and User Experience (UX) in IS Research**  
   Includes studies on UI/UX design, cognitive aspects of interface use, and evaluation of digital agents, dashboards, or immersive systems (e.g., VR, fMRI studies).

7. **ICT Design for Digital Inclusion, Social Equity, and Access Among Marginalized Populations**  
   Explores how IS artifacts and ICTs support underserved users, promote fairness, or address digital divides (e.g., in fintech, education, governance).

8. **Digital Platform Business Models, User Engagement Mechanisms, and Market Transaction Design**  
   Focused on platform economics, pricing strategy, user-generated content, crowdfunding, and dual-sided market structure in digital platforms.

9. **Green IS, Corporate Social Responsibility, and Sustainable Information System Practices**  
   IS for sustainability, including carbon tracking systems, CSR impact, and design science contributions toward environmental or social responsibility.

10. **Theory Building, Methodological Innovation, and Meta-Level Scholarship in IS**  
    Includes literature reviews, methodological guidelines, theory synthesis, or frameworks for meta-theoretical advancement in the IS field.

## Output Format

Return a **JSON object** in this exact format:  
`{"category_number": X, "rationale": "..."}`

Where:
- **category_number**: (integer) the number of the selected category, from 1 to 10.
- **rationale**: (string) a brief classification rationale, **no more than two sentences**, explaining why this category was chosen.

## Input Example

**Title:** Algorithm Sensemaking: How Platform Workers Make Sense of Algorithmic Management  
**Abstract:** This paper investigates how gig workers interpret algorithmic management by applying the sensemaking lens. Through a qualitative study of ride-hailing platforms, we examine how digital control mechanisms shape worker perceptions and strategic responses.

**Output:**  
```json
{
  "category_number": 3,
  "rationale": "The study applies sensemaking theory to platform workers’ interpretations of algorithmic management, examining how control mechanisms influence perceptions and actions within a digital platform. This aligns most closely with social media and online community dynamics."
}
```

## Classification Rule

- Assign **only one** category.
- Choose the category that reflects the **primary theoretical or empirical contribution** of the study.
- If multiple categories are relevant, select the one with the **closest alignment to the core research question**.
- If the input does **not clearly fit any category**, set `category_number` to `0` and explain why in `rationale`.

Now classify the following input:



# [Input]
### **Title:** Explaining Persistent Ineffectiveness in Professional Online Communities: Multilevel Tensions and Misguided Coping Strategies
### **Abstract:** Abstract
Online communities (OCs) have become an increasingly prevalent way for organizations to bring people together to collaborate and create value. However, despite the abundance of extant literature, many studies still point to the lack of long-term sustainability of OCs. We contend that communities become dormant or obsolete over time because of manifestations of ineffectiveness a state of the community that hinders the attainment of individual and collective desired outcomes. While ineffectiveness in OCs is common, it is less apparent why such ineffectiveness persists. Two knowledge gaps are particularly significant here. First, while the multilevel nature of OCs is acknowledged, corresponding difficulties in aligning individual and collective interests and behaviors have often been neglected in past studies. Second, rare longitudinal studies have revealed that community members respond to ineffectiveness with various coping behaviors. However, the impact of these coping behaviors may not turn out as desired. Consequently, we investigate the persistence of ineffectiveness from the perspective of multilevel and coping effects, addressing the following research question: How and why does ineffectiveness persist in online communities? Our critical realist case study offers a three-step explanatory framework: (1) underlying multilevel tensions in the community contribute to usage ineffectiveness (i.e., members are unable to use the OC effectively); (2) misguided coping behaviors contribute to ineffective adaptation (i.e., members are unable to cope with not being able to use the OC effectively); and (3) ineffectiveness persists due to the interaction between usage and adaptation ineffectiveness.


## Model

In [38]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(
    model="gpt-4.1-mini-2025-04-14",
    temperature=0.0
)

In [25]:
# Model testing

# Format the prompt
formatted_prompt = prompt_template.format(
    title=papers_df.iloc[0]['title'],
    abstract=papers_df.iloc[0]['abstract']
)

# Pass the formatted prompt directly to the model
model_response = model.invoke(formatted_prompt)
# `.content` holds the reply text; `.text` is a method here, so printing it
# without calling it only shows the bound-method repr.
print(model_response.content)
print(model_response.usage_metadata)

{"category": 3}
{'input_tokens': 1144, 'output_tokens': 7, 'total_tokens': 1151, 'input_token_details': {'audio': 0, 'cache_read': 1024}, 'output_token_details': {'audio': 0, 'reasoning': 0}}
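
The test call above reports 1,151 total tokens for one paper. A rough budget for the full run can be sketched by multiplying that figure by the number of papers. This is only an estimate: it assumes every abstract is about as long as the first one, and the count of 169 papers is taken from the combined table later in the notebook, whose last index is 168. The helper `estimate_total_tokens` is hypothetical.

```python
def estimate_total_tokens(n_papers: int, tokens_per_call: int) -> int:
    """Back-of-the-envelope token budget: one model call per paper."""
    return n_papers * tokens_per_call


# 169 papers at roughly 1,151 tokens each (both figures are assumptions)
print(estimate_total_tokens(169, 1151))  # 194519
```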


## Output Parser

In [39]:
from langchain_core.output_parsers import JsonOutputParser

OutputParser = JsonOutputParser()

In [54]:
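from langchain_core.runnables import RunnableLambda

# Keep only the two expected fields. The numeric label is stored under
# `category_name` because the later cells and the saved CSV use that column.
Formatter = RunnableLambda(lambda x: {
    "category_name": x["category_number"],
    "rationale": x.get("rationale", "")
})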
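
The formatter reads `category_number` from the parsed JSON, so a reply that uses a different key or an out-of-range number would fail or slip through unnoticed. A minimal sketch of a stricter check is shown below. `parse_classification` is a hypothetical helper written with the standard library only; it is not a LangChain API, and the 0 to 10 range assumes that 0 is reserved for "no clear fit" as the prompt's classification rule describes.

```python
import json


def parse_classification(raw: str) -> dict:
    """Parse a model reply and check it against the expected schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    number = data.get("category_number")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(number, bool) or not isinstance(number, int):
        raise ValueError(f"category_number must be an integer, got {number!r}")
    if not 0 <= number <= 10:
        raise ValueError(f"category_number out of range: {number}")
    return {"category_number": number, "rationale": str(data.get("rationale", ""))}


print(parse_classification('{"category_number": 3, "rationale": "Community dynamics."}'))
```

A reply such as `{"category": 3}` raises `ValueError` here instead of surfacing later as a `KeyError` inside the chain.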


## Chain

In [48]:
Classifier_Chain = (
    prompt_template
    | model
    | OutputParser
    | Formatter
)

In [51]:
papers_df.head()

Unnamed: 0,title,abstract,citation,url,year,volume,issue
0,Explaining Persistent Ineffectiveness in Profe...,Abstract\nOnline communities (OCs) have become...,"Recommended Citation\n\n\n Stein, Mari-Klar...",https://aisel.aisnet.org/jais//vol23/iss1/1/,2022,23,1
1,A Design Theory for Energy and Carbon Manageme...,Abstract\nEnergy and carbon management systems...,"Recommended Citation\n\n\n Zampou, Eleni; M...",https://aisel.aisnet.org/jais//vol23/iss1/2/,2022,23,1
2,Examining the Impacts of Airbnb Review Policy ...,"Abstract\nIn July 2014, Airbnb, one of the big...","Recommended Citation\n\n\n Mousavi, Reza an...",https://aisel.aisnet.org/jais//vol23/iss1/3/,2022,23,1
3,Inventing Together: The Role of Actor Goals an...,Abstract\nWith the ubiquity of the internet an...,"Recommended Citation\n\n\n Abhari, Kaveh; D...",https://aisel.aisnet.org/jais//vol23/iss1/4/,2022,23,1
4,The Design of a System for Online Psychosocial...,Abstract\nThe design of sensitive online healt...,"Recommended Citation\n\n\n Sjöström, Jonas;...",https://aisel.aisnet.org/jais//vol23/iss1/5/,2022,23,1


In [52]:
records = papers_df[["title", "abstract"]].to_dict(orient="records")
results = Classifier_Chain.batch(records, config={"max_concurrency": 10})
df_result = pd.DataFrame(results)
papers_df = pd.concat([papers_df, df_result], axis=1)
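
As written, one failed request or unparsable reply aborts the whole batch. LangChain's `Runnable.batch` accepts `return_exceptions=True`, which returns the exception object in place of the failed item. The sketch below wraps that behavior; `batch_keep_failures` and `StubChain` are hypothetical names, and the stub only stands in for `Classifier_Chain` so the example runs without an API key.

```python
from typing import Any


def batch_keep_failures(chain: Any, records: list, max_concurrency: int = 10) -> list:
    """Run chain.batch and turn per-record exceptions into {'error': ...} rows."""
    outputs = chain.batch(
        records,
        config={"max_concurrency": max_concurrency},
        return_exceptions=True,  # failed items come back as exception objects
    )
    return [
        {"error": str(out)} if isinstance(out, Exception) else out
        for out in outputs
    ]


class StubChain:
    """Stand-in for Classifier_Chain (assumption: same batch signature)."""

    def batch(self, inputs, config=None, return_exceptions=False):
        results = []
        for item in inputs:
            if item["abstract"]:
                results.append({"category_number": 3, "rationale": "stub"})
            else:
                results.append(ValueError("empty abstract"))
        return results


rows = batch_keep_failures(
    StubChain(),
    [{"title": "A", "abstract": "text"}, {"title": "B", "abstract": ""}],
)
print(rows)
```

Because the result keeps one entry per input, `pd.DataFrame(rows)` still lines up with the original rows. Failed papers get an `error` value and `NaN` in the other columns, so they can be retried separately.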

In [53]:
papers_df.head(5)

Unnamed: 0,title,abstract,citation,url,year,volume,issue,category_name,rationale
0,Explaining Persistent Ineffectiveness in Profe...,Abstract\nOnline communities (OCs) have become...,"Recommended Citation\n\n\n Stein, Mari-Klar...",https://aisel.aisnet.org/jais//vol23/iss1/1/,2022,23,1,3,The paper focuses on the dynamics within profe...
1,A Design Theory for Energy and Carbon Manageme...,Abstract\nEnergy and carbon management systems...,"Recommended Citation\n\n\n Zampou, Eleni; M...",https://aisel.aisnet.org/jais//vol23/iss1/2/,2022,23,1,9,The paper focuses on designing energy and carb...
2,Examining the Impacts of Airbnb Review Policy ...,"Abstract\nIn July 2014, Airbnb, one of the big...","Recommended Citation\n\n\n Mousavi, Reza an...",https://aisel.aisnet.org/jais//vol23/iss1/3/,2022,23,1,3,The paper studies user-generated content and b...
3,Inventing Together: The Role of Actor Goals an...,Abstract\nWith the ubiquity of the internet an...,"Recommended Citation\n\n\n Abhari, Kaveh; D...",https://aisel.aisnet.org/jais//vol23/iss1/4/,2022,23,1,8,The paper focuses on user engagement and parti...
4,The Design of a System for Online Psychosocial...,Abstract\nThe design of sensitive online healt...,"Recommended Citation\n\n\n Sjöström, Jonas;...",https://aisel.aisnet.org/jais//vol23/iss1/5/,2022,23,1,2,The paper focuses on designing an online healt...
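
The `category_name` column holds only the category number. For readable tables and plot legends, the numbers can be mapped back to the short labels listed at the top of this notebook. The dictionary and helper below are an illustrative sketch. The names `CATEGORY_LABELS` and `label_for` are hypothetical, and treating any other value (such as 0) as "Unclassified" is an assumption.

```python
# Short labels copied from the category list at the top of the notebook
CATEGORY_LABELS = {
    1: "AI- and Machine Learning Applications in IS",
    2: "IS Security, Privacy, and Technology Adoption",
    3: "Social Media and Online Community Behavior",
    4: "Digital Transformation and IT Strategy in Organizations",
    5: "Data Analytics and IS Performance",
    6: "HCI, UX, and Interface Design in IS",
    7: "ICT for Digital Inclusion and Social Equity",
    8: "Platform Business Models and Digital Markets",
    9: "Green IS, CSR, and Sustainability",
    10: "IS Theories, Methods, and Meta-Research",
}


def label_for(number: int) -> str:
    """Return the short label for a category number, or 'Unclassified'."""
    return CATEGORY_LABELS.get(number, "Unclassified")


print(label_for(9))
```

With pandas this becomes `papers_df["category_label"] = papers_df["category_name"].map(label_for)`.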


In [55]:
papers_df.to_csv('JAIS_papers_classified.csv', index=False, encoding='utf-8-sig')

### Additional classification for preprints

In [63]:
preprint = pd.read_csv('JAIS_preprints.csv', encoding='utf-8-sig')
preprint

Unnamed: 0,title,abstract,citation,url,year,volume,issue
0,Data Control Coordination in the Formation of...,Abstract\nEcosystems in highly regulated secto...,"Recommended Citation\n\n Spagnoletti, Paolo...",https://aisel.aisnet.org/jais_preprints/167,2025,0,Preprint
1,What Is Augmented? A Meta-Narrative Review of ...,Abstract\nThe widespread implementation of art...,"Recommended Citation\n\n Baer, Inès; Waarde...",https://aisel.aisnet.org/jais_preprints/168,2025,0,Preprint
2,Computationally Intensive Research: Advancing...,Abstract\nThis paper draws attention to the po...,"Recommended Citation\n\n Mohajeri, Kaveh an...",https://aisel.aisnet.org/jais_preprints/169,2025,0,Preprint
3,Achieving Reward-Based Crowdfunding Project Su...,Abstract\nCrowdfunding project founders increa...,"Recommended Citation\n\n Frimpong, Bright; ...",https://aisel.aisnet.org/jais_preprints/170,2025,0,Preprint
4,Achieving Reward-Based Crowdfunding Project Su...,Abstract\nCrowdfunding project founders increa...,"Recommended Citation\n\n Frimpong, Bright; ...",https://aisel.aisnet.org/jais_preprints/171,2025,0,Preprint
5,Capturing the “Social” in Social Networks: The...,Abstract\nSocial networks are omnipresent in b...,"Recommended Citation\n\n Meske, Christian; ...",https://aisel.aisnet.org/jais_preprints/172,2025,0,Preprint
6,Polarization or Bias: Take Your Click on Soci...,Abstract\nA major policy concern that has emer...,"Recommended Citation\n\n Dey, Debabrata; La...",https://aisel.aisnet.org/jais_preprints/173,2025,0,Preprint
7,Polarization or Bias: Take Your Click on Soci...,Abstract\nA major policy concern that has emer...,"Recommended Citation\n\n Dey, Debabrata; La...",https://aisel.aisnet.org/jais_preprints/174,2025,0,Preprint
8,Empowering Marginalized Communities: A Framewo...,Abstract\nSocial inclusion—the ability to part...,"Recommended Citation\n\n Qureshi, Israr; Bh...",https://aisel.aisnet.org/jais_preprints/175,2025,0,Preprint
9,Corporate Nomads: Working at the Boundary Betw...,Abstract\nDigital nomads are knowledge workers...,"Recommended Citation\n\n Marx, Julian; Mirb...",https://aisel.aisnet.org/jais_preprints/176,2025,0,Preprint
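
Rows 3 and 4, and rows 6 and 7, show the same truncated title under different URLs. If these are repeat postings of the same paper, they would be counted twice in any trend analysis. A minimal sketch of dropping them by normalized title follows. This is an assumption about the data: the displayed titles are truncated, so whether the full titles match should be checked first. `drop_duplicate_titles` is a hypothetical helper.

```python
import pandas as pd


def drop_duplicate_titles(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the first row for each title, ignoring case and extra whitespace."""
    # Normalize: lowercase, then split on whitespace and re-join with single spaces.
    key = df["title"].str.lower().str.split().str.join(" ")
    return df.loc[~key.duplicated(keep="first")].reset_index(drop=True)


demo = pd.DataFrame({
    "title": ["Polarization or Bias", "polarization  or bias", "Corporate Nomads"],
    "url": ["u/173", "u/174", "u/176"],
})
print(drop_duplicate_titles(demo))
```

Running this before the classification step would also avoid paying for duplicate model calls.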


In [64]:
records = preprint[["title", "abstract"]].to_dict(orient="records")
results = Classifier_Chain.batch(records, config={"max_concurrency": 10})
df_result = pd.DataFrame(results)
preprint = pd.concat([preprint, df_result], axis=1)
preprint[['title', 'category_name', 'rationale']].head(5)

Unnamed: 0,title,category_name,rationale
0,Data Control Coordination in the Formation of...,4,The paper focuses on organizational-level coor...
1,What Is Augmented? A Meta-Narrative Review of ...,1,The paper focuses on AI-based augmentation and...
2,Computationally Intensive Research: Advancing...,10,The paper focuses on methodological innovation...
3,Achieving Reward-Based Crowdfunding Project Su...,8,The paper focuses on reward-based crowdfunding...
4,Achieving Reward-Based Crowdfunding Project Su...,8,The paper focuses on reward-based crowdfunding...


In [66]:
papers_df = pd.concat([papers_df, preprint], axis=0, ignore_index=True)
papers_df

Unnamed: 0,title,abstract,citation,url,year,volume,issue,category_name,rationale
0,Explaining Persistent Ineffectiveness in Profe...,Abstract\nOnline communities (OCs) have become...,"Recommended Citation\n\n\n Stein, Mari-Klar...",https://aisel.aisnet.org/jais//vol23/iss1/1/,2022,23,1,3,The paper focuses on the dynamics within profe...
1,A Design Theory for Energy and Carbon Manageme...,Abstract\nEnergy and carbon management systems...,"Recommended Citation\n\n\n Zampou, Eleni; M...",https://aisel.aisnet.org/jais//vol23/iss1/2/,2022,23,1,9,The paper focuses on designing energy and carb...
2,Examining the Impacts of Airbnb Review Policy ...,"Abstract\nIn July 2014, Airbnb, one of the big...","Recommended Citation\n\n\n Mousavi, Reza an...",https://aisel.aisnet.org/jais//vol23/iss1/3/,2022,23,1,3,The paper studies user-generated content and b...
3,Inventing Together: The Role of Actor Goals an...,Abstract\nWith the ubiquity of the internet an...,"Recommended Citation\n\n\n Abhari, Kaveh; D...",https://aisel.aisnet.org/jais//vol23/iss1/4/,2022,23,1,8,The paper focuses on user engagement and parti...
4,The Design of a System for Online Psychosocial...,Abstract\nThe design of sensitive online healt...,"Recommended Citation\n\n\n Sjöström, Jonas;...",https://aisel.aisnet.org/jais//vol23/iss1/5/,2022,23,1,2,The paper focuses on designing an online healt...
...,...,...,...,...,...,...,...,...,...
165,Polarization or Bias: Take Your Click on Soci...,Abstract\nA major policy concern that has emer...,"Recommended Citation\n\n Dey, Debabrata; La...",https://aisel.aisnet.org/jais_preprints/174,2025,0,Preprint,8,The paper focuses on social media platforms' p...
166,Empowering Marginalized Communities: A Framewo...,Abstract\nSocial inclusion—the ability to part...,"Recommended Citation\n\n Qureshi, Israr; Bh...",https://aisel.aisnet.org/jais_preprints/175,2025,0,Preprint,7,The paper focuses on how information systems c...
167,Corporate Nomads: Working at the Boundary Betw...,Abstract\nDigital nomads are knowledge workers...,"Recommended Citation\n\n Marx, Julian; Mirb...",https://aisel.aisnet.org/jais_preprints/176,2025,0,Preprint,4,The paper focuses on the organizational-level ...
168,Why Do We Follow Virtual Influencer Recommenda...,Abstract\nVirtual influencers (VI) have receiv...,"Recommended Citation\n\n Nissen, Anika; Con...",https://aisel.aisnet.org/jais_preprints/177,2025,0,Preprint,6,The study focuses on human cognitive and emoti...


In [68]:
papers_df.to_csv('JAIS_papers_classified.csv', index=False, encoding='utf-8-sig')

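# Task Instructions
- Visualize research categories by year
    1. Bar plot of the number of papers per category for each year (20)
    2. Line plot of the number of papers per year for each category
- Review whether additional analysis is needed to reveal research trends
    1. Perform any additional analysis and data preprocessing judged necessary
    2. Perform the additional data analysis and visualization
- Write a report analyzing how research trends have changed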
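
Both requested plots can start from one year-by-category count table. A minimal sketch with `pd.crosstab` is shown below. The helper `category_counts_by_year` is hypothetical, the column names `year` and `category_name` follow the classified table earlier in the notebook, and reserving column 0 for unclassified papers is an assumption.

```python
import pandas as pd


def category_counts_by_year(
    df: pd.DataFrame, year_col: str = "year", category_col: str = "category_name"
) -> pd.DataFrame:
    """Count papers per (year, category); rows are years, columns are categories."""
    counts = pd.crosstab(df[year_col], df[category_col])
    # Show every category 0-10 even when a year has no papers in it.
    return counts.reindex(columns=range(0, 11), fill_value=0)


demo = pd.DataFrame({"year": [2022, 2022, 2023], "category_name": [3, 3, 9]})
counts = category_counts_by_year(demo)
print(counts)
```

With matplotlib installed, `counts.plot(kind="bar")` would give the per-year bar plot and `counts.plot()` the per-category line plot.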
   
# Notes
- **All visualizations must be produced in English**
- **The analysis results (report) must be written in Korean**
- **For 2025, 
