Support of clustering plot (2D UMAP) #584

Closed
karelin opened this issue Jun 27, 2022 · 34 comments

@karelin

karelin commented Jun 27, 2022

Hi there,
Just wondering whether the current version of BERTopic supports a 2D UMAP plot with clustering, like the first plot in the original post https://towardsdatascience.com/topic-modeling-with-bert-779f7db187e6

I didn't find such a plot in the documentation, but it could be rather useful when analyzing a document collection.

@drob-xx

drob-xx commented Jun 27, 2022

I think the answer is that BERTopic doesn't 'support' this particular visualization, but it is relatively easy to do on your own. What you need is a 2D representation of the embeddings. The simplest way is to run a 2D reduction on the UMAP embeddings saved within your current model, so something like:

import umap
umap_2d = umap.UMAP(n_components=2).fit_transform(MyBERTopicModel.umap_model.embedding_)

Then you can use the output as x, y coordinates for a scatter plot. The above reduction is not going to be very pretty, however, because it is a 2D UMAP reduction of the 5D UMAP reduction of the original embeddings. You can get a 'nicer' looking scatter by creating a 2D t-SNE projection from umap_model.embedding_ in the same way, with the downside that t-SNE takes longer than UMAP. Alternatively, you can take the original embeddings and reduce them directly to 2D with UMAP, the way Maarten did in the original Medium article. Not sure if any of this is helpful.
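
A minimal sketch of the last option (original embeddings reduced straight to 2D, then plotted per topic). The function names, the cosine metric, and the matplotlib styling are assumptions made for illustration, not something BERTopic ships; `embeddings` stands for whatever document embeddings were passed to the model, and `topics` for the list returned by fit_transform, where -1 marks outliers.

```python
from collections import defaultdict


def reduce_to_2d(embeddings, random_state=42):
    """Reduce the ORIGINAL document embeddings straight to 2D with UMAP."""
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(n_components=2, metric="cosine", random_state=random_state)
    return reducer.fit_transform(embeddings)


def group_points_by_topic(coords, topics):
    """Map each topic id to the x and y lists of its documents."""
    groups = defaultdict(lambda: ([], []))
    for (x, y), topic in zip(coords, topics):
        xs, ys = groups[topic]
        xs.append(float(x))
        ys.append(float(y))
    return dict(groups)


def plot_clusters(coords, topics):
    """Scatter plot with one colour per topic; outliers (-1) in light grey."""
    import matplotlib.pyplot as plt

    for topic, (xs, ys) in sorted(group_points_by_topic(coords, topics).items()):
        colour = "lightgrey" if topic == -1 else None  # None: next colour in the cycle
        plt.scatter(xs, ys, s=2, c=colour, label=str(topic))
    plt.show()
```

Calling plot_clusters(reduce_to_2d(embeddings), topics) would then draw the cluster plot; the grouping helper is kept separate so the same coordinates can be fed to another plotting library.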

I totally agree that plotting out the embeddings is very useful. It has fundamentally altered how I understand BERTopic. If you want code to do some of the above, you can refer to a github repo I put together as part of the discussion at #582. Hope this is helpful and not too in the weeds.

@karelin
Author

karelin commented Jun 27, 2022

Hey Dan!
Thank you very much. I am thinking of using two UMAP transforms (ND -> 5D and ND -> 2D) then.

@MaartenGr
Owner

@karelin You are in luck! I am almost finished with a function called .visualize_documents() that allows you to visualize the documents interactively, with options for optimizing the output, since plotting potentially millions of points can be troublesome. I intend to push it to the currently open PR sometime this week so you can try it out. In the meantime, thanks to @drob-xx for sharing your code to get started creating your own!

@karelin
Author

karelin commented Jun 28, 2022

@MaartenGr Awesome!
Could you post here when the PR is ready?

@MaartenGr
Owner

@karelin The PR is still currently in the works but I just implemented the .visualize_documents feature for you to try out. You can find the documentation and instructions here and you can already install the PR with:

pip install --upgrade git+https://github.com/MaartenGr/BERTopic.git@refs/pull/578/merge` 

Doing so allows you to try it out before the release of the new version. The official release most likely will take a couple more weeks but I will let you know when it is ready!

@doubianimehdi

Hi! I tried the command above to install the branch, but it didn't work: when I run pip list, bertopic is still at version 0.10.0.

@doubianimehdi

I also get this warning: WARNING: Did not find branch or tag 'refs/pull/578/merge', assuming revision or ref.

@MaartenGr
Owner

It seems that there was a character at the end of the link that should have been removed. The install should be as follows:

pip install git+https://github.com/MaartenGr/BERTopic.git@refs/pull/578/merge

After doing so, you can test it by running something like the following to see if you now have the new features:

from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset='all',  remove=('headers', 'footers', 'quotes'))["data"]
topic_model = BERTopic(verbose=True)
topics, probs = topic_model.fit_transform(docs)

hierarchical_topics = topic_model.hierarchical_topics(docs, topics)

@doubianimehdi

Thank you, but even without the character it's not working.

Here's the output:

Collecting git+https://github.com/MaartenGr/BERTopic.git@refs/pull/578/merge
Cloning https://github.com/MaartenGr/BERTopic.git (to revision refs/pull/578/merge) to c:\users\doub2420\appdata\local\temp\pip-req-build-0bcznlrk
Running command git clone --filter=blob:none --quiet https://github.com/MaartenGr/BERTopic.git 'C:\Users\doub2420\AppData\Local\Temp\pip-req-build-0bcznlrk'
WARNING: Did not find branch or tag 'refs/pull/578/merge', assuming revision or ref.
Running command git fetch -q https://github.com/MaartenGr/BERTopic.git refs/pull/578/merge
Running command git checkout -q 2bcc9ea
Resolved https://github.com/MaartenGr/BERTopic.git to commit 2bcc9ea
Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy>=1.20.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from bertopic==0.10.0) (1.21.6)
Requirement already satisfied: hdbscan>=0.8.28 in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from bertopic==0.10.0) (0.8.28)
Requirement already satisfied: umap-learn>=0.5.0 in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from bertopic==0.10.0) (0.5.3)
Requirement already satisfied: pandas>=1.1.5 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from bertopic==0.10.0) (1.4.2)
Requirement already satisfied: scikit-learn>=0.22.2.post1 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from bertopic==0.10.0) (0.24.2)
Requirement already satisfied: tqdm>=4.41.1 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from bertopic==0.10.0) (4.64.0)
Requirement already satisfied: sentence-transformers>=0.4.1 in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from bertopic==0.10.0) (2.2.0)
Requirement already satisfied: plotly>=4.7.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from bertopic==0.10.0) (5.7.0)
Requirement already satisfied: pyyaml<6.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from bertopic==0.10.0) (5.4.1)
Requirement already satisfied: scipy>=1.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from hdbscan>=0.8.28->bertopic==0.10.0) (1.8.0)
Requirement already satisfied: joblib>=1.0 in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from hdbscan>=0.8.28->bertopic==0.10.0) (1.1.0)
Requirement already satisfied: cython>=0.27 in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from hdbscan>=0.8.28->bertopic==0.10.0) (0.29.28)
Requirement already satisfied: pytz>=2020.1 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from pandas>=1.1.5->bertopic==0.10.0) (2022.1)
Requirement already satisfied: python-dateutil>=2.8.1 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from pandas>=1.1.5->bertopic==0.10.0) (2.8.2)
Requirement already satisfied: six in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from plotly>=4.7.0->bertopic==0.10.0) (1.16.0)
Requirement already satisfied: tenacity>=6.2.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from plotly>=4.7.0->bertopic==0.10.0) (8.0.1)
Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from scikit-learn>=0.22.2.post1->bertopic==0.10.0) (3.1.0)
Requirement already satisfied: transformers<5.0.0,>=4.6.0 in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from sentence-transformers>=0.4.1->bertopic==0.10.0) (4.18.0)
Requirement already satisfied: torch>=1.6.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from sentence-transformers>=0.4.1->bertopic==0.10.0) (1.11.0)
Requirement already satisfied: torchvision in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from sentence-transformers>=0.4.1->bertopic==0.10.0) (0.12.0)
Requirement already satisfied: nltk in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from sentence-transformers>=0.4.1->bertopic==0.10.0) (3.7)
Requirement already satisfied: sentencepiece in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from sentence-transformers>=0.4.1->bertopic==0.10.0) (0.1.96)
Requirement already satisfied: huggingface-hub in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from sentence-transformers>=0.4.1->bertopic==0.10.0) (0.5.1)
Requirement already satisfied: colorama in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from tqdm>=4.41.1->bertopic==0.10.0) (0.4.3)
Requirement already satisfied: numba>=0.49 in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from umap-learn>=0.5.0->bertopic==0.10.0) (0.55.1)
Requirement already satisfied: pynndescent>=0.5 in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from umap-learn>=0.5.0->bertopic==0.10.0) (0.5.6)
Requirement already satisfied: setuptools in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from numba>=0.49->umap-learn>=0.5.0->bertopic==0.10.0) (58.1.0)
Requirement already satisfied: llvmlite<0.39,>=0.38.0rc1 in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from numba>=0.49->umap-learn>=0.5.0->bertopic==0.10.0) (0.38.0)
Requirement already satisfied: typing-extensions in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from torch>=1.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (4.2.0)
Requirement already satisfied: requests in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (2.27.1)
Requirement already satisfied: filelock in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (3.6.0)
Requirement already satisfied: sacremoses in c:\users\doub2420\appdata\roaming\python\python39\site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (0.0.50)
Requirement already satisfied: regex!=2019.12.17 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (2022.4.24)
Requirement already satisfied: tokenizers!=0.11.3,<0.13,>=0.11.1 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (0.12.1)
Requirement already satisfied: packaging>=20.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (21.3)
Requirement already satisfied: click in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from nltk->sentence-transformers>=0.4.1->bertopic==0.10.0) (8.0.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from torchvision->sentence-transformers>=0.4.1->bertopic==0.10.0) (9.1.0)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from packaging>=20.0->transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (3.0.8)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (1.26.9)
Requirement already satisfied: idna<4,>=2.5 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (3.3)
Requirement already satisfied: charset-normalizer~=2.0.0 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (2.0.12)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\doub2420\appdata\local\programs\python\python39\lib\site-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers>=0.4.1->bertopic==0.10.0) (2021.10.8)

@doubianimehdi

and when I try hierarchical_topics = topic_model.hierarchical_topics(abstract, topics)


AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_14196/2730210245.py in
----> 1 hierarchical_topics = topic_model.hierarchical_topics(abstract, topics)

AttributeError: 'BERTopic' object has no attribute 'hierarchical_topics'

@MaartenGr
Owner

I would advise starting from a completely fresh environment and then installing BERTopic via the link provided instead. Then, after installing, make sure to restart the notebook that you are working in.
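
A small sketch for checking, after the restart, whether the interpreter actually picked up the PR build. In the pip log above the PR build still reports itself as bertopic==0.10.0, so the version string alone may not tell the two apart; looking for the new methods is a more direct test. The helper names here are made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def missing_attributes(obj, names):
    """Return the names from `names` that `obj` does not provide."""
    return [name for name in names if not hasattr(obj, name)]


# Intended use in a freshly restarted kernel (needs bertopic installed):
#   from bertopic import BERTopic
#   print(installed_version("bertopic"))
#   print(missing_attributes(BERTopic, ["hierarchical_topics", "visualize_documents"]))
# An empty list from the second call means the new methods are available.
```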

@doubianimehdi

I've been able to test the features and I have a request:
Would it be possible to display the text with carriage returns? Also, if we have a URL in the data, could the data point be made clickable so that it opens the URL? Besides that, it seems to be exactly what I've been looking for!

Thank you again for your AMAZING work !

@doubianimehdi

And also on the hierarchical visualization, we don't see the text on hover and we can't click on it either like the non-hierarchical one

@MaartenGr
Owner

> Would that be possible to display the text with carriage returns

I believe that Plotly does not generate newlines on either carriage returns or line feeds. What might work is using <br> instead but in my experience Plotly's go.Scattergl does not behave entirely the same as the regular scatterplots, so there is a chance that it will not work.
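
A minimal sketch of the <br> idea: convert carriage returns and line feeds in each document before handing the documents to the visualization. The helper name is hypothetical, and, as said above, it is not confirmed that go.Scattergl hovers will render the breaks.

```python
import re

# Matches Windows (\r\n), old Mac (\r) and Unix (\n) line endings
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def to_hover_html(text):
    """Replace every line ending with the <br> tag that Plotly text understands."""
    return _LINE_BREAK.sub("<br>", text)
```

The converted list, for example [to_hover_html(doc) for doc in docs], would be passed for plotting only; the model itself should still be fitted on the original documents.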

> if we have a URL in the data could make the data point clickable and open the URL

I just checked the Plotly documentation and from what I can tell this is unfortunately not possible in their current API.

> And also on the hierarchical visualization, we don't see the text on hover and we can't click on it either like the non-hierarchical one

Strange, for me the following is working without any problems:

from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset='all',  remove=('headers', 'footers', 'quotes'))["data"]
topic_model = BERTopic(verbose=True)
topics, probs = topic_model.fit_transform(docs)

hierarchical_topics = topic_model.hierarchical_topics(docs, topics)

Then, visualize the hierarchy with hover:

topic_model.visualize_hierarchy(hierarchical_topics=hierarchical_topics)

Could you share the code you have been using to get the hierarchical visualization?

@doubianimehdi

doubianimehdi commented Jul 6, 2022

I meant this function:

# Run the visualization with the original embeddings
topic_model.visualize_hierarchical_documents(abstract, hierarchical_topics, embeddings=embeddings)

Hovering doesn't work there the way it does in:

# Run the visualization with the original embeddings
topic_model.visualize_documents(abstract, embeddings=embeddings)

As for the hover and clickable URL, the Doc2Map package uses this:

def plotly_interactive_map(self, G=None, root=None):

    def cluster(node, lLeaf, image=True):
        
        
        fig = go.Figure(go.Scatter(
            y = [self.lDocEmbedding2D[i,0] for i in lLeaf],
            x = [self.lDocEmbedding2D[i,1] for i in lLeaf],
            mode = 'markers',
            #marker = {"size": 0.7}
            customdata=[([data["URL"]], [data["label"]]) for data in self.lData],
            hovertemplate=(
                "Label: <b>%{customdata[1]}</b><br>"+
                "URL: %{customdata[0]}"+
                "<extra></extra>")
        ))

I believe that customdata and hovertemplate could be used in a similar manner in Scattergl (https://plotly.github.io/plotly.py-docs/generated/plotly.graph_objects.Scattergl.html).
Its documentation mentions customdata and hovertemplate too.
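
To make the idea concrete, here is a minimal sketch of the customdata and hovertemplate pattern with a regular go.Scatter. The "URL" and "label" keys are taken from the Doc2Map snippet above; the function names are made up, and this only shows the URL in the hover box, it does not make the point clickable.

```python
def build_hover(records):
    """Build per-point customdata rows and a hover template that reads them."""
    customdata = [[rec["URL"], rec["label"]] for rec in records]
    hovertemplate = (
        "Label: <b>%{customdata[1]}</b><br>"
        "URL: %{customdata[0]}"
        "<extra></extra>"  # hides the secondary box with the trace name
    )
    return customdata, hovertemplate


def scatter_with_hover(xs, ys, records):
    """Return a figure whose hover box shows each point's label and URL."""
    import plotly.graph_objects as go

    customdata, hovertemplate = build_hover(records)
    return go.Figure(
        go.Scatter(
            x=xs,
            y=ys,
            mode="markers",
            customdata=customdata,
            hovertemplate=hovertemplate,
        )
    )
```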

Or maybe I'm completely wrong; I've never built Python packages before.

Thanks again !

@MaartenGr
Owner

> topic_model.visualize_hierarchical_documents(abstract, hierarchical_topics, embeddings=embeddings)

That is correct, hovering is turned off by default as you risk memory errors by loading in so many documents. The following should do the trick:

topic_model.visualize_hierarchical_documents(
    abstract, 
    hierarchical_topics, 
    embeddings=embeddings, 
    hide_document_hover=False
)

There are quite a few parameters that you can find in the visualization functions. Going through the docstrings should help quite a bit.

> I believe that customdata and hovertemplate could be used in similar manner in scattergl (https://plotly.github.io/plotly.py-docs/generated/plotly.graph_objects.Scattergl.html)
> It mentions the customdata and hovertemplate too ...

Unfortunately this is not possible at the moment as go.Scattergl has a few issues generating the same hovers as the regular go.Scatter. See this issue for example.

@doubianimehdi

Thank you!
Too bad about Scattergl. How much effort would it take to modify it to use go.Scatter instead and include the URLs? Just to see what it would involve!

Thanks so much again !!!

@MaartenGr
Owner

@doubianimehdi go.Scattergl is necessary for scalability. Plotly can have issues visualizing thousands of points, let alone millions. For that reason, we need something that can handle that a bit better than the regular go.Scatter. If the hover issue gets fixed in Plotly, I'll make sure to implement it in BERTopic!
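
For anyone who wants to experiment with go.Scatter in their own plots despite the scalability concern, one option is to thin out the points first. Below is a minimal sketch of per-topic subsampling; the helper name, the fraction parameter (expected to be between 0 and 1), and the rule of keeping at least one document per topic are assumptions for illustration, not BERTopic behaviour.

```python
import random
from collections import defaultdict


def sample_indices_per_topic(topics, fraction, seed=0):
    """Pick a reproducible random subset of document indices from every topic."""
    rng = random.Random(seed)
    by_topic = defaultdict(list)
    for index, topic in enumerate(topics):
        by_topic[topic].append(index)

    kept = []
    for indices in by_topic.values():
        # Keep at least one document so that no topic disappears from the plot
        n_keep = max(1, round(len(indices) * fraction))
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)
```

The returned indices can be used to select the matching documents, topics, and 2D coordinates before plotting.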

@MaartenGr
Owner

Seeing this was implemented in v0.11, I will close this issue for now. Feel free to ping me if you want to continue this discussion.

@doubianimehdi

@MaartenGr Hi ! Thank you for your wonderful work ! I was getting back to this implementation because I wanted to do a visualization similar to this : https://get.carrotsearch.com/foamtree/latest/demos/large.html

But for that I have to use this : https://get.carrotsearch.com/foamtree/latest/api/

I'm not a front end man at all ... unfortunately ... I was wondering if you or some talented member of this community, could do this or help to do this ?

Thank you so much again !

@MaartenGr
Owner

@doubianimehdi If you want to keep it straightforward, then you can also use plotly for this as it has implemented Treemaps. Other than that, I am not familiar with carrotsearch unfortunately.

@doubianimehdi

@MaartenGr Thanks ! That's what I was thinking for my Proof of Concept phase ... but later the beautiful interface of carrotsearch would be a good addition to my final product !

@doubianimehdi

@MaartenGr I'm having a hard time seeing how I can use the hierarchical topics dataframe to adapt it to a treemap ... could you give me some clue to achieve this ? Thank you !

@MaartenGr
Owner

@doubianimehdi No problem, it is just a few lines of code to get this working:

# Prepare children
children_left = (hierarchical_topics.Child_Left_ID + "_" + hierarchical_topics.Child_Left_Name).tolist()
children_right = (hierarchical_topics.Child_Right_ID + "_" + hierarchical_topics.Child_Right_Name).tolist()
children = children_left + children_right

# Prepare parents
parents = (hierarchical_topics.Parent_ID + "_" + hierarchical_topics.Parent_Name).tolist()
parents = parents + parents

# Plot treemap
import plotly.express as px
fig = px.treemap(names = children, parents = parents)
fig.update_traces(root_color="lightgrey")
fig.update_layout(margin = dict(t=50, l=25, r=25, b=25))
fig.show()

@doubianimehdi

@MaartenGr Thank you! I want to go further and build the full hierarchy with a slider to navigate through the levels of topics. What would it take to do that?

@MaartenGr
Owner

@doubianimehdi I am not sure whether something like that is possible. You would have to dive into the source code of plotly I think.

@doubianimehdi

@MaartenGr According to https://towardsdatascience.com/make-a-treemap-in-python-426cee6ee9b8 it's possible, but the structure of the hierarchical topics is confusing to me and I'm having a hard time translating it.

@MaartenGr
Owner

@doubianimehdi If you follow along with that tutorial and use the code I shared above, I think it might be possible. You would have to try some things out yourself first. Do note, though, that the widget is a Jupyter-specific module and not part of plotly.
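
As a starting point for that translation, here is a minimal sketch that turns the parent/child rows into one row per node with its parent and its depth below the root. The column names are the ones used in the treemap code above; the output keys (ID, Name, Parent_ID, Level) and the function name are assumptions chosen for illustration.

```python
def build_tree_rows(hierarchy_rows):
    """Flatten parent/left-child/right-child rows into one row per node."""
    parent_of = {}
    name_of = {}
    for row in hierarchy_rows:
        name_of[row["Parent_ID"]] = row["Parent_Name"]
        for side in ("Left", "Right"):
            child = row[f"Child_{side}_ID"]
            parent_of[child] = row["Parent_ID"]
            name_of[child] = row[f"Child_{side}_Name"]

    def depth(node):
        # Count the steps needed to walk up to the root (root has level 0)
        level = 0
        while node in parent_of:
            node = parent_of[node]
            level += 1
        return level

    return [
        {
            "ID": node,
            "Name": name_of[node],
            "Parent_ID": parent_of.get(node, ""),  # empty parent marks the root
            "Level": depth(node),
        }
        for node in name_of
    ]
```

With a DataFrame, the input rows could come from hierarchical_topics.to_dict("records"), and the result can be wrapped in pd.DataFrame again to feed a treemap.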

@doubianimehdi

Hi @MaartenGr I've done some tests ... i'm almost there but I can't wrap my head around something :
Here is the dataframe i'm using :

Name Level Parent_ID Num_Documents
2 pt catalyst_pt catalysts_electrocatalysts_tio2_electrocatalyst 9 50
1 pt catalysts_catalysts_membrane fuel_catalyst_electrocatalysts 9 50
11 blend membranes_methanol permeability_membranes proton_methanol fuel_hybrid membranes 7 51
0 nanocomposite membranes_composite membrane_composite membranes_membrane fuel_membranes proton 7 51
18 membrane fuel_membranes fuel_nafion membrane_nafion membranes_methanol fuel 6 52
15 membrane fuel_membranes fuel_membranes pems_membrane materials_membrane pem 6 52
4 membrane fuel_pem fuel_pemfc_membrane pem_fuel cell 9 53
3 model pemfc_pem fuel_pemfc model_pemfcs_pemfc 9 53
50 pt catalyst_pt catalysts_electrocatalysts_membrane fuel_catalysts 8 54
7 pem fuel_membrane fuel_electrodeposition_pemfcs_pemfc 8 54
10 pt catalyst_pt catalysts_graphene oxide_membrane fuel_electrocatalysts 7 55
54 pt catalyst_pt catalysts_membrane fuel_pt nanoparticles_cells pemfcs 7 55
29 pemfc_pemfcs_pem fuel_cells pemfcs_membrane fuel 6 56
30 membrane fuel_pem fuel_porous layer_pemfc_pemfcs 6 56
13 membrane fuel_electrolyzer_electrolyser_electrochemical impedance_pem fuel 8 57
53 membrane fuel_pem fuel_pemfcs_pemfc stack_pemfc 8 57
5 pemfc_pem fuel_membrane fuel_cell pemfc_fuel cell 6 58
6 pemfc_pemfcs_cell pemfc_membrane fuel_pem fuel 6 58
58 pemfc_pemfcs_membrane fuel_pem fuel_cell pemfc 5 59
24 membrane fuel_pem fuel_pressure drop_flow pressure_gas flow 5 59
16 membrane fuel_porous media_pem fuel_pemfc_porous 5 60
56 pem fuel_pemfc_pemfcs_membrane fuel_cell pemfc 5 60
23 fuel cells_fuel cell_membrane fuel_hydrogen fuel_pem fuel 7 61
34 fuel cell_hydrogen fuel_fuel cells_membrane fuel_pem fuel 7 61
9 multiblock copolymers_copolymers_block copolymers_copolymer_polymer 6 62
51 nanocomposite membranes_membrane fuel_composite membranes_composite membrane_membranes proton 6 62
20 membranes_poly vinylidene_exchange membranes_sulfonic acid_vinylidene fluoride 5 63
62 membrane fuel_composite membranes_composite membrane_membranes proton_proton conductivities 5 63
21 energy exergy_pemfc_membrane fuel_pemfc stack_fuel cell 7 64
19 power hydrogen_solar energy_hydrogen production_electrolyzer_pem electrolyzer 7 64
8 anode_carbon corrosion_electrochemical_membrane fuel_pem fuel 8 65
17 anode catalyst_membrane fuel_cathode catalyst_anode_pemfcs 8 65
33 hydrogen production_co oxidation_catalysts_co hydrogen_co2 6 66
28 hydrogen production_steam reformer_steam reforming_methanol steam_membrane fuel 6 66
32 membrane fuel_fuel cell_fuel cells_pem fuel_anode 7 67
57 pem fuel_pemfcs_pemfc_membrane fuel_fuel cell 7 67
36 membrane fuel_pem fuel_anodes_fuel cell_fuel cells 7 68
65 membrane fuel_pem fuel_anode_cathode catalyst_pemfcs 7 68
46 de energia_rendimiento_energia_para la_celulas combustivel 5 69
47 dans les_dans le_pour les_dans la_materiaux 5 69
52 membrane fuel_membranes fuel_membrane pem_nafion membrane_nafion membranes 5 70
25 membranes_exchange membrane_exchange membranes_membrane_proton conductivity 5 70
27 carbon composite_pemfc_pem fuel_membrane fuel_graphite 7 71
12 corrosion behavior_corrosion density_corrosion resistance_stainless steel_cathodic 7 71
22 electrolysis hydrogen_electrolyzer_water electrolysis_electrolysis water_membrane electrolysis 6 72
26 electrolyzers_water electrolyzers_water electrolysis_electrolyzer_electrolysis 6 72
39 nanocomposite membranes_membranes pems_nafion membrane_graphene oxide_composite membranes 4 73
37 nanofiber composite_nanofiber_nanofibers_electrospun nanofiber_electrospun nanofibers 4 73
35 pemfc_cooled fuel_pem fuel_pemfc stack_heat flux 6 74
67 pem fuel_membrane fuel_pemfcs_pemfc_fuel cell 6 74
70 membrane fuel_membranes fuel_nafion membrane_nafion membranes_pemfc 4 75
63 membrane fuel_membranes proton_composite membranes_composite membrane_membranes 4 75
31 membrane fuel_membrane electrode_membrane_exchange membrane_fuel cell 9 76
42 membrane material_membrane proton_exchange membrane_sulfonic acid_membrane 9 76
64 hydrogen production_energy exergy_electrolyzer_pem electrolyzer_electrolyser 6 77
61 fuel cell_fuel cells_membrane fuel_hydrogen fuel_electrical energy 6 77
74 membrane fuel_pem fuel_pemfcs_pemfc_fuel cell 5 78
38 dc converter_fuel cell_converter_fuel cells_membrane fuel 5 78
73 membranes pems_nanofiber_nanofibers_membrane fuel_composite membranes 3 79
75 membrane fuel_composite membranes_composite membrane_membranes proton_pems 3 79
68 membrane fuel_anode_cathode catalyst_pem fuel_cells pemfcs 6 80
55 pt catalyst_pt catalysts_electrocatalysts_carbon supported_membrane fuel 6 80
44 membrane fuel_electrochemical environment_temperature voltage_fuel cell_electrolysis cell 5 81
40 porous silicon_membrane fuel_silicon membrane_membraneless fuel_mems fuel 5 81
59 pemfc_membrane fuel_pem fuel_cell pemfc_fuel cell 4 82
60 membrane fuel_porous layer_pem fuel_pemfc_pemfcs 4 82
78 membrane fuel_pem fuel_pemfcs_pemfc_pemfc stack 4 83
81 membrane fuel_mems fuel_silicon membrane_fuel cell_membraneless fuel 4 83
41 viscoplastic_mechanical properties_membrane fuel_stress strain_mechanical durability 9 84
14 anode_microbial fuel_membrane microbial_anode chamber_anode cathode 9 84
66 hydrogen production_hydrogen gas_membrane fuel_steam reforming_methanol steam 5 85
77 hydrogen fuel_hydrogen storage_hydrogen production_fuel cell_fuel cells 5 85
76 membrane fuel_membrane electrode_membrane_exchange membrane_membrane invention 8 86
84 anode_microbial fuel_anode chamber_membrane microbial_anode cathode 8 86
79 membrane fuel_composite membranes_composite membrane_nanocomposite_membranes proton 2 87
45 membrane vanadium_nafion membrane_membranes vanadium_vanadium redox_vanadium permeability 2 87
72 water electrolyzers_water electrolysis_water electrolyzer_electrolyzer_electrolyzers 5 88
80 pt catalyst_pt catalysts_membrane fuel_catalyst layer_electrocatalysts 5 88
86 microbial fuel_anode_anode chamber_membrane fuel_microbial 7 89
49 references_fuel cell_fuel cells_fuels 14_figures xi 7 89
48 실리카 나노_연료전지 스택의_생물전기화학적 수소_고분자 전해질_전해질 연료전지 6 90
43 menghasilkan_yang lebih_menunjukkan_menunjukkan bahwa_dan tegangan 6 90
89 anode_microbial fuel_anode chamber_membrane fuel_cod removal 6 91
71 corrosion density_corrosion resistance_steel bipolar_pemfc_pemfcs 6 91
82 membrane fuel_pem fuel_pemfc_pemfcs_fuel cell 3 92
83 pem fuel_membrane fuel_pemfcs_pemfc_pemfc stack 3 92
90 menghasilkan_yang lebih_menunjukkan_menunjukkan bahwa_dan tegangan 5 93
91 pemfc_anode_corrosion resistance_stainless steel_steel bipolar 5 93
85 energy exergy_hydrogen production_hydrogen storage_hydrogen fuel_electrolyzer 4 94
88 pt catalyst_pt catalysts_catalyst layer_catalysts_electrocatalysts 4 94
69 rendimiento_energia_combustible de_para la_de combustible 4 95
93 pemfc_corrosion resistance_anode_membrane fuel_electrochemical 4 95
94 pem fuel_membrane fuel_membrane pem_pemfcs_pemfc 3 96
95 pemfc_membrane fuel_anode_electrochemical_corrosion resistance 3 96
96 membrane fuel_pem fuel_electrolysis_membrane pem_pemfcs 2 97
92 membrane fuel_pem fuel_pemfc_pemfcs_membrane pem 2 97
97 membrane fuel_pem fuel_pemfcs_pemfc_membrane pem 1 98
87 membrane fuel_composite membranes_composite membrane_nanocomposite_membranes proton 1 98
98 membrane fuel_pem fuel_fuel cell_membrane pem_pemfc 0

Then i'm using this snippet of code :
import plotly.graph_objs as go

def generate_treemap(level):
    return go.Figure(
        go.Treemap(
            labels=tree_df['Name'],
            ids=tree_df['ID'],
            parents=tree_df['Parent_ID'],
            customdata=tree_df['Num_Documents'],
            text=tree_df['Level'].apply(lambda x: '' if x > level else None),
            hovertemplate="%{label}<br>ID: %{id}<br>Num Documents: %{customdata}",
            visible=level == 0,
        )
    )

max_level = tree_df['Level'].max()
figures = [generate_treemap(level) for level in range(max_level + 1)]

fig = go.Figure(figures[0])

for level in range(1, max_level + 1):
    fig.add_trace(figures[level]['data'][0])

steps = []
for index in range(max_level + 1):
    step = dict(
        method="restyle",
        args=["visible", [False] * (max_level + 1)],
        label=str(index)
    )
    step["args"][1][index] = True
    steps.append(step)

sliders = [dict(
    active=0,
    currentvalue={"prefix": "Level: "},
    pad={"t": 20},
    steps=steps
)]

fig.update_layout(sliders=sliders)
fig.show()

It runs, but moving the slider does not change the nesting or the level that is shown.

Can you help ?

@MaartenGr
Owner

I am not entirely sure but based on the Plotly documentation it seems that you will have to do an "update" method and not a "restyle" method. I would advise following the example linked to the Plotly documentation and replacing it with the treemap. Having said that, the link you provided shows an example with ipywidgets and the code you shared is a slider with plotly, so I am not sure whether the latter works with treemaps.

@doubianimehdi

I DID IT !
import plotly.graph_objects as go
import pandas as pd

def create_treemap_data(level):
    # Keep only the nodes at or above the requested depth
    mask = tree_df['Level'] <= level
    return go.Treemap(
        labels=tree_df.loc[mask, 'Name'],
        ids=tree_df.loc[mask, 'ID'],
        parents=tree_df.loc[mask, 'Parent_ID'],
        customdata=tree_df.loc[mask, 'Num_Documents'],
        hovertemplate="%{label}<br>ID: %{id}<br>Num Documents: %{customdata}",
    )

max_level = tree_df['Level'].max()

fig_dict = {
    "data": [create_treemap_data(0)],
    "layout": {},
    "frames": [],
}

# Create frames for each level (starting at 0 so every slider step has a matching frame)
for level in range(max_level + 1):
    frame = {"data": [create_treemap_data(level)], "name": str(level)}
    fig_dict["frames"].append(frame)

# Create slider steps
steps = []
for level in range(max_level + 1):
    step = {"args": [
        [str(level)],
        {"frame": {"duration": 300, "redraw": True},
         "mode": "immediate",
         "transition": {"duration": 300}}],
        "label": str(level),
        "method": "animate"}
    steps.append(step)

# Configure slider
sliders = [{"active": 0, "steps": steps, "x": 0.1, "y": 0, "len": 0.9}]

# Add slider to layout
fig_dict["layout"]["sliders"] = sliders

# Create figure from the dictionary
fig = go.Figure(fig_dict)

fig.show()

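It works because each slider step now animates to a frame that holds the treemap filtered to that level, instead of only toggling the visibility of traces that all contained the same nodes.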
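
A possible lighter alternative, sketched here without having been tried on this data: keep a single treemap trace with all nodes and let the slider restyle the trace's maxdepth attribute. The function names are hypothetical, and the mapping from level to maxdepth (level + 1, so that level 0 shows only the root) is my reading of the attribute's documentation rather than something confirmed in this thread.

```python
def maxdepth_steps(max_level):
    """Build one slider step per level; each step restyles the maxdepth attribute."""
    steps = []
    for level in range(max_level + 1):
        steps.append(
            {
                "method": "restyle",
                "args": ["maxdepth", level + 1],  # assumed mapping: level 0 -> root only
                "label": str(level),
            }
        )
    return steps


def treemap_with_depth_slider(ids, labels, parents, max_level):
    """Return a treemap figure whose slider controls how many levels are drawn."""
    import plotly.graph_objects as go

    fig = go.Figure(go.Treemap(ids=ids, labels=labels, parents=parents, maxdepth=1))
    fig.update_layout(
        sliders=[
            {
                "active": 0,
                "currentvalue": {"prefix": "Level: "},
                "steps": maxdepth_steps(max_level),
            }
        ]
    )
    return fig
```

Compared with the frame-based version, this avoids building one filtered trace per level, at the cost of relying on how Plotly interprets maxdepth.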

@MaartenGr
Owner

Great! Glad to hear that you found the solution.

@doubianimehdi

@MaartenGr Thank you for your help! It would be great to have a visualization like that in BERTopic :)

@MaartenGr
Owner

@doubianimehdi I cannot make any promises, as I do not want to depend too much on plotly since it might be replaced in the future with a different plotting library, but I will definitely keep it in mind!
