Unable to Connect to HDFS for Tiered Storage Setup #63941

Open
avj-vaibhav opened this issue May 16, 2024 · 7 comments
Labels
question (Question?), st-wontfix (Known issue, no plans to fix it currently)

Comments

@avj-vaibhav

avj-vaibhav commented May 16, 2024

Issue: Unable to Connect to HDFS for Tiered Storage Setup

Description

I am trying to implement a tiered storage setup using ClickHouse's External Storage with HDFS. I have configured the storage_config.xml in /etc/clickhouse/config.d/ as follows:

<clickhouse>
    <storage_configuration>
        <disks>
            <hdfs>
                <type>hdfs</type>
                <endpoint>hdfs://localhost:8020/clickhouse/</endpoint>
                <skip_access_check>true</skip_access_check>
            </hdfs>
            <ssd>
                <type>local</type>
                <path>/</path>
            </ssd>
        </disks>
        <policies>
            <tiered_storage>
                <volumes>
                    <hot>
                        <disk>ssd</disk>
                    </hot>
                    <cold>
                        <disk>hdfs</disk>
                    </cold>
                </volumes>
            </tiered_storage>
        </policies>
    </storage_configuration>
</clickhouse>

Upon saving this configuration and restarting the ClickHouse server, I encounter the following error:

2024.05.16 11:03:32.836279 [ 94636 ] {} <Error> Application: Code: 660. DB::Exception: Unable to connect to HDFS: Hdfs::HdfsRpcException: HdfsFailoverException: Failed to invoke RPC call "getFsStats" on server "localhost:8020" Caused by: HdfsNetworkConnectException: Connect to "localhost:8020" failed: (errno: 111) Connection refused: Cannot parse definition from metadata file /var/lib/clickhouse/store/7fe/7fe07a23-cb8e-4134-a5da-b06958e772bf/asynchronous_metric_log.sql

Additional Information:

  • I can access HDFS via the command line with the command: $HADOOP_HOME/bin/hdfs dfs -ls /, which successfully lists the directories in the HDFS root directory.
  • The endpoint in the storage_config.xml matches the one provided in the core-site.xml of the HDFS configuration.

Questions

  1. Is there a different default port that HDFS uses instead of 8020?
  2. Are there additional configurations needed in HDFS or ClickHouse to establish a successful connection?
  3. What steps can I take to troubleshoot and resolve the "Connection refused" error?
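
On question 3: errno 111 means nothing accepted the TCP connection on localhost:8020 from the point of view of the process that tried it. A minimal sketch of a reachability check follows, assuming it is run on the same machine (and ideally as the same user) as clickhouse-server. The helper name `can_connect` is hypothetical; the host and port are simply the ones from the `<endpoint>` above. It only tests that the port accepts connections, not that an HDFS NameNode is the thing answering.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" (errno 111), timeouts and DNS failures.
        return False


if __name__ == "__main__":
    # Host and port are taken from the <endpoint> in the config above.
    host, port = "localhost", 8020
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

If this reports the port as not reachable while the hdfs command line still works, the NameNode is probably listening on a different address or port than the one in the endpoint, so comparing against the value of fs.defaultFS that the hdfs client actually uses would be the next step.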

Steps to Reproduce

  1. Configure storage_config.xml as shown above.
  2. Restart the ClickHouse server.
  3. Observe the error in the ClickHouse server logs.

Environment

  • ClickHouse version: [24.4.1.2088]
  • Hadoop version: [3.3.6]
  • Operating System: [Linux (ubuntu 20.04)]

Any guidance, reference documentation, or suggestions to resolve this issue would be greatly appreciated.

Thank you!

avj-vaibhav added the question label on May 16, 2024
avj-vaibhav changed the title from "How to add HDFS as cold storage for tiered policy ?" to "Unable to Connect to HDFS for Tiered Storage Setup" on May 16, 2024
@den-crane
Contributor

den-crane commented May 16, 2024

HDFS is unusable as a MergeTree disk with the current ClickHouse implementation.

@avj-vaibhav
Author

Hey @den-crane,
Thank you for the quick response.

After reviewing the mentioned issue, it seems ClickHouse doesn't support HDFS. Is it possible to use a mounted or external disk for 'cold' storage instead?

Any references or documentation on this would be greatly appreciated.
Thank you!

@den-crane
Contributor

den-crane commented May 16, 2024

> it seems ClickHouse doesn't support HDFS.

Right.

> Is it possible to use a mounted or external disk for 'cold' storage instead?

HDFS as a mounted/external disk? I have never heard of such a technology, and I doubt it would work.

@avj-vaibhav
Author

Hey @den-crane

No, not HDFS as a mounted disk.
Can I just use any storage disk, like an HDD or SSD, that is mounted on the ClickHouse server machine as cold storage?

Thank you!

@den-crane
Contributor

> Can I just use any storage disk, like an HDD or SSD, that is mounted on the ClickHouse server machine as cold storage?

Yes, you can.

@avj-vaibhav
Author

Hey @den-crane,

I'm trying to use an external disk mounted to the ClickHouse server machine for cold storage, but I'm encountering the following error:

<Error> Application: std::exception. Code: 1001, type: std::__1::__fs::filesystem::filesystem_error, e.what() = filesystem error: in posix_stat: failed to determine attributes for the specified path: Permission denied [/home/server-user/cold-storage/]

I have verified that the specified path /home/server-user/cold-storage/ has the correct permissions set for server-user. However, when ClickHouse attempts to write data to this path, it throws the above error.

Additional Information:

  • The path is not owned by root; it is owned by server-user.
  • I have come across several issues suggesting downgrading ClickHouse to version 22.x.x as a potential solution.

Could you please advise on the best approach to resolve this issue? Is downgrading ClickHouse a viable solution, or is there another recommended method to address this permission error?

Thank you!

alexey-milovidov added the st-wontfix label (Known issue, no plans to fix it currently) on May 20, 2024
@den-crane
Contributor

den-crane commented May 21, 2024

@avj-vaibhav

  1. How do you run clickhouse-server?
  2. Check whether this succeeds: sudo -u server-user mkdir /home/server-user/cold-storage/test
  3. What is this /cold-storage/, some sort of USB disk? What filesystem does it have?
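
A note on the check in item 2: a `posix_stat ... Permission denied` error can come from a parent directory rather than from the target itself. If the user that clickhouse-server runs as cannot traverse /home/server-user, the stat fails even though cold-storage/ has the right owner. A minimal sketch that lists every path component with its owner, mode and whether the current user can traverse it is below. The helper name `explain_path` is hypothetical, and the script assumes it is run as the same user the server runs as (often a dedicated `clickhouse` user for package installs, but that is an assumption about this setup).

```python
import os
import pwd
import stat
from pathlib import Path


def explain_path(path):
    """Return (path, owner, mode, traversable) for each component from / down to path."""
    target = Path(os.path.abspath(path))
    rows = []
    for p in [*reversed(target.parents), target]:
        try:
            st = p.stat()
        except OSError as err:
            # Cannot stat this component (missing, or a parent denies access); stop here.
            rows.append((str(p), "?", f"<{err.strerror}>", False))
            break
        try:
            owner = pwd.getpwuid(st.st_uid).pw_name
        except KeyError:
            owner = str(st.st_uid)  # uid has no passwd entry
        # X_OK on a directory means the current user may traverse (search) it.
        rows.append((str(p), owner, stat.filemode(st.st_mode), os.access(p, os.X_OK)))
    return rows


if __name__ == "__main__":
    # Path taken from the error message above.
    for p, owner, mode, ok in explain_path("/home/server-user/cold-storage/"):
        print(f"{mode:<12} {owner:<14} {'ok' if ok else 'BLOCKED':<8} {p}")
```

If a parent such as /home/server-user shows up as BLOCKED for the server's user, the options are to grant that user traverse permission on it or to put the cold-storage directory somewhere that user can already reach.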


3 participants